Error 41 on backup of HP-UX IA64

Chad_Grill
Level 3
Hello,
I am at my wits' end with this one.  I have an HP Itanium server running HP-UX 11.31 that I cannot get a complete backup of.  Right at the end of the job, I get an error 41 - network connection timed out.  I know that neither the Master server running the backup nor the HP-UX client is having network issues (a ping runs uninterrupted and people are logged onto the HP-UX server all day via telnet).  Although the job has gone through its retries and aborted on the NBU server, I can still see that the job is running on the client.  I'm running NetBackup Server 6.5 on W2K3. The only noteworthy log entry I found was this:

bpbrm (server) -
13:58:48.021 [5228.9344] <2> bpbrm readline: bpbrm timeout after 1000 seconds
13:58:48.021 [5228.9344] <2> bpbrm kill_child_process_Ex: start
13:58:49.459 [5228.9344] <2> bpbrm wait_for_child: start
13:58:49.459 [5228.9344] <2> bpbrm wait_for_child: child exit_status = 150
13:58:54.599 [5228.9344] <2> bpbrm Exit: client backup EXIT STATUS 41: network connection timed out

Any help would be GREATLY appreciated.

Thank You
1 ACCEPTED SOLUTION


Nicolai
Moderator
Partner    VIP   

FIX : Add IGNORE_XATTR = YES to bp.conf

There are also some EEBs (Emergency Engineering Binaries) available that fix the problem. Contact Symantec support to obtain them.

Other articles about the subject:

Netbackup client hang on HP-UX 11.31

or

Backup fails with status 13 when bpbkar hangs processing /dev files on VxFS 5.0. ...
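
A minimal sketch of applying the bp.conf change above on the HP-UX client. The default path /usr/openv/netbackup/bp.conf and the helper name set_ignore_xattr are assumptions, not something stated in this thread. The function only appends the entry when no IGNORE_XATTR line exists yet, so running it twice does not duplicate it.

```shell
# Hypothetical helper: append IGNORE_XATTR = YES to bp.conf if no such entry exists.
# $1 = path to bp.conf; defaults to the usual UNIX client location (assumption).
set_ignore_xattr() {
    conf="${1:-/usr/openv/netbackup/bp.conf}"
    if grep '^IGNORE_XATTR *=' "$conf" >/dev/null 2>&1; then
        # An entry is already there; leave it alone and report it.
        echo "IGNORE_XATTR already present in $conf"
    else
        echo 'IGNORE_XATTR = YES' >> "$conf"
    fi
}
```

Run it as root on the client with no argument to use the default path. If an existing entry has a different value, the sketch leaves it untouched, so check the file by hand in that case.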



14 REPLIES

Chad_Grill
Level 3
Oh, and before anyone asks, I have tried increasing the timeout to the max.  Had no effect. :(

Marianne
Level 6
Partner    VIP    Accredited Certified
What process do you see still running on the client?
I wonder if you could possibly have the issue reported by  'veritas-bu' list members:
http://mailman.eng.auburn.edu/pipermail/veritas-bu/2009-January/102703.html


Chad_Grill
Level 3
However I don't have that patch installed.

Marianne
Level 6
Partner    VIP    Accredited Certified
Create bpbkar log directory on the client before you run the next backup.
See if any errors are reported in the log after backup failure.
The bpbrm error in your 1st post tells me that the media server has not received any data from the client in the last 1000 seconds.
Check bptm log on the media server for details about data received from client.
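
A minimal sketch of the log directory step, assuming the default UNIX client install path /usr/openv/netbackup/logs (the path and the helper name make_bpbkar_logdir are assumptions, not taken from this thread). bpbkar writes its legacy debug log only if the directory already exists when the backup starts.

```shell
# Hypothetical helper: create the bpbkar legacy debug log directory.
# $1 = NetBackup logs directory; defaults to the usual UNIX client location (assumption).
make_bpbkar_logdir() {
    logs_dir="${1:-/usr/openv/netbackup/logs}"
    # Print the directory name only if it was created (or already existed).
    mkdir -p "$logs_dir/bpbkar" && echo "$logs_dir/bpbkar"
}
```

Run `make_bpbkar_logdir` as root on the client before the next backup attempt, then look in the printed directory after the job fails.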

Chad_Grill
Level 3
Here's the only error I get in the log:

11:59:00.194 [21640] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE
11:59:00.194 [21640] <2> bpbkar sighandler: INF - ignoring additional SIGPIPE signals
11:59:00.194 [21640] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 40: network connection broken
11:59:00.194 [21640] <4> bpbkar Exit: INF - EXIT STATUS 40: network connection broken
11:59:00.194 [21640] <2> bpbkar Exit: INF - Close of stdout complete

Also, I was able to get some files backed up when I selected particular directories instead of doing everything from the root down.

Stumpr2
Level 6
what does traceroute display?
also from master or media server please run
bptestbpcd -client <clientname> -verbose

Chad_Grill
Level 3
Output from bptestbpcd:

c:\Program Files\VERITAS\NetBackup\bin\admincmd>bptestbpcd -client hp1 -verbose
1 1 1
10.200.20.222:1529 -> 10.200.90.11:13724
10.200.20.222:1530 -> 10.200.90.11:13724
PEER_NAME = mimic
HOST_NAME = hp1
CLIENT_NAME = hp1
VERSION = 0x06500000
PLATFORM = hpia64
10.200.20.222:1531 -> 10.200.90.11:13724

Marianne
Level 6
Partner    VIP    Accredited Certified
Have a look at the lines in bpbkar log preceding the ones that you've posted.
It could possibly be the problem described in this TechNote (although the status code is different):
http://seer.entsupport.symantec.com/docs/276429.htm
The bpbkar daemon hangs while processing HP-UX pwgrd socket files (/var/spool/sockets/pwgr/*).
One possible solution is to create exclude_list for the socket files.

Chad_Grill
Level 3
I found multiple socket files, which I put in the exclude_list, but it still failed.

Chad_Grill
Level 3
After increasing the log level, I found that backup stops at the same spot every time:

17:42:02.996 [6133] <4> bpbkar PrintFile: /dev/syscon
17:42:02.996 [6133] <4> bpbkar PrintFile: /dev/systty
17:42:02.996 [6133] <4> bpbkar PrintFile: /dev/tcp
17:42:02.996 [6133] <4> bpbkar PrintFile: /dev/tcp6
17:42:02.996 [6133] <4> bpbkar PrintFile: /dev/telnetm
17:42:02.997 [6133] <4> bpbkar PrintFile: /dev/tty
17:42:02.997 [6133] <4> bpbkar PrintFile: /dev/tty3p0

I tried excluding /dev/tty3p0, but it still hangs (the only difference is that the bpbkar log now reports the file as skipped).
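
To compare where several failed runs stopped, the last PrintFile entry can be pulled out of each bpbkar log. This is only a sketch: the helper name last_printed_file is made up, and it assumes the log lines look like the excerpt above. Note that the last entry logged is not necessarily the file that hangs; the culprit may be the next entry in the same directory, which would fit the hang persisting after /dev/tty3p0 was excluded.

```shell
# Hypothetical helper: print the last path that bpbkar logged via PrintFile.
# $1 = path to a bpbkar debug log file.
last_printed_file() {
    grep 'bpbkar PrintFile: ' "$1" | tail -n 1 | sed 's/.*bpbkar PrintFile: //'
}
```

Call it with the path of a bpbkar log file from a failed run. If it prints the same path for every failed run, the hang is at or just after that entry.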

Chad_Grill
Level 3
Since it appeared to be hanging in the /dev directory we decided to exclude it from the backups.  Not a big deal for us as we create makerecovery tapes on a regular basis.

Thanks everyone for your help.

Marianne
Level 6
Partner    VIP    Accredited Certified
why not exclude entire /dev? I cannot see any need to ever restore /dev.
I also remember something about /proc that needs to be excluded - can't remember off-hand which O/S...
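
A rough sketch of excluding /dev on the client, assuming the exclude list lives at the default UNIX location /usr/openv/netbackup/exclude_list with one path per line (the location and the helper name add_exclude are assumptions). It adds an entry only if that exact line is not already in the file.

```shell
# Hypothetical helper: add a path to the client exclude list if it is not already there.
# $1 = path to exclude
# $2 = exclude list file; defaults to the usual UNIX client location (assumption).
add_exclude() {
    list="${2:-/usr/openv/netbackup/exclude_list}"
    touch "$list" || return 1
    # -F: fixed string, -x: match the whole line, so "/dev" does not match "/dev/tty3p0".
    grep -Fx -- "$1" "$list" >/dev/null || printf '%s\n' "$1" >> "$list"
}
```

For example, `add_exclude /dev` run as root on the client would exclude the whole directory, and `add_exclude '/var/spool/sockets/pwgr/*'` would cover the pwgrd socket files mentioned earlier.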

Andy_Welburn
Level 6
NetBackup automatically excludes the following file system types on most platforms:

■ cdrom (all UNIX platforms)
■ cachefs (AIX, Solaris, SGI, UnixWare)
■ devpts (Linux)
■ mntfs (Solaris)
■ proc (UNIX platforms; does not exclude automatically for AIX, so /proc must be added manually to the exclude list. If /proc is not added manually, partially successful backups may result with the ALL_LOCAL_DRIVES directive on AIX)
■ tmpfs (Linux)
■ usbdevfs (Linux)

Also:
DOCUMENTATION: Using the ALL_LOCAL_DRIVES policy directive or the root directory "/" as the Backup S...

