
AIX 6.1 client backup (backup selection /app) fails with status 13

A_3
Level 4
Certified

After the media is mounted, we get the error below:

-----------------------------------------
07/30/2014 02:10:58 - begin writing
07/30/2014 02:33:26 - Error bpbrm (pid=13762766) socket read failed: errno = 119 - System call timed out
07/30/2014 02:33:26 - Info bpbrm (pid=8519918) sending message to media manager: STOP BACKUP rmsdbtrnrk1p_1406711401

The original backup selection is "All local drives". All mount points except /app back up successfully.

When I create a new policy with the backup selection /app*, the backup succeeds.

16 REPLIES

RamNagalla
Moderator
Partner    VIP    Certified

What is your client read timeout value?

If it is the default of 300 seconds, try 900 seconds and see how it goes.

A_3
Level 4
Certified

Yes, we already tried increasing it, but the backup still fails. The remaining mount points back up successfully.

RamNagalla
Moderator
Partner    VIP    Certified

Set verbose to 5 on the client and make sure the bpbkar log directory exists under /usr/openv/netbackup/logs/.

Then trigger the backup job with multistreaming enabled and let it fail.

Then attach the bpbkar log and the job details of the failed job to this post.

Also let us know the NetBackup version installed on the client, master, and media servers.

INT_RND
Level 6
Employee Accredited

You refer to /app as a "mount point". If it is a remote file system then it won't be backed up without the policy option "Cross mount points".

It is recommended to perform the backup directly from the file server that hosts the remote share to avoid excessive network strain and unnecessary system usage.

REFERENCE ARTICLE:

Cross mount points (policy attribute)

http://www.symantec.com/business/support/index?page=content&id=HOWTO86616

A_3
Level 4
Certified

@Nagalla: Master/Media/Client ---->7.5.0.7

I retriggered the same backup today. It failed after writing 1 GB.

Anyway, I will share the requested logs.

A_3
Level 4
Certified

07/30/2014 21:14:46 - Info nbjm (pid=7977) starting backup job (jobid=2238343) for client Client_abc-bkup, policy ux_std_1500_sat_1p, schedule daily_ux

07/30/2014 21:14:46 - Info nbjm (pid=7977) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=2238343, request id:{3519F388-1869-11E4-B14F-6F0C1C7110A7})

07/30/2014 21:14:46 - requesting resource Mediaserver_def-tld0

07/30/2014 21:14:46 - requesting resource Master_server.NBU_CLIENT.MAXJOBS.Client_abc-bkup

07/30/2014 21:14:46 - requesting resource Master_server.NBU_POLICY.MAXJOBS.ux_std_1500_sat_1p

07/30/2014 21:14:47 - granted resource  Master_server.NBU_CLIENT.MAXJOBS.Client_abc-bkup

07/30/2014 21:14:47 - granted resource  Master_server.NBU_POLICY.MAXJOBS.ux_std_1500_sat_1p

07/30/2014 21:14:47 - granted resource  501221

07/30/2014 21:14:47 - granted resource  IBM.ULTRIUM-TD5.011

07/30/2014 21:14:47 - granted resource  Mediaserver_def-tld0

07/30/2014 21:14:54 - estimated 0 kbytes needed

07/30/2014 21:14:54 - Info nbjm (pid=7977) started backup (backupid=Client_abc-bkup_1406780094) job for client Client_abc-bkup, policy ux_std_1500_sat_1p, schedule daily_ux on storage unit Mediaserver_def-tld0

07/30/2014 21:15:04 - Info bpbrm (pid=11206672) Client_abc-bkup is the host to backup data from

07/30/2014 21:15:04 - Info bptm (pid=18939966) using 262144 data buffer size

07/30/2014 21:15:04 - Info bpbrm (pid=11206672) telling media manager to start backup on client

07/30/2014 21:15:04 - Info bptm (pid=18939966) using 32 data buffers

07/30/2014 21:15:04 - connecting

07/30/2014 21:15:05 - Info bpbrm (pid=11206672) spawning a brm child process

07/30/2014 21:15:05 - Info bpbrm (pid=11206672) child pid: 15663198

07/30/2014 21:15:05 - connected; connect time: 0:00:00

07/30/2014 21:15:06 - Info bpbrm (pid=11206672) sending bpsched msg: CONNECTING TO CLIENT FOR Client_abc-bkup_1406780094

07/30/2014 21:15:06 - Info bptm (pid=15991036) setting receive network buffer to 262144 bytes

07/30/2014 21:15:07 - Info bpbrm (pid=11206672) start bpbkar on client

07/30/2014 21:15:07 - Info bpbkar (pid=7012700) Backup started

07/30/2014 21:15:07 - Info bpbrm (pid=11206672) Sending the file list to the client

07/30/2014 21:15:38 - mounted 501221

07/30/2014 21:15:38 - positioning 501221 to file 2145

07/30/2014 21:16:35 - positioned 501221; position time: 0:00:57

07/30/2014 21:16:35 - begin writing

07/30/2014 21:38:42 - Error bpbrm (pid=15663198) socket read failed: errno = 119 - System call timed out

07/30/2014 21:38:42 - Info bpbrm (pid=11206672) sending message to media manager: STOP BACKUP Client_abc-bkup_1406780094

07/30/2014 21:38:49 - end writing; write time: 0:22:14

07/30/2014 21:38:51 - Info bpbrm (pid=11206672) media manager for backup id Client_abc-bkup_1406780094 exited with status 150: termination requested by administrator

07/30/2014 21:45:56 - Info bpbrm (pid=18940004) starting bptm

07/30/2014 21:45:56 - Info bpbrm (pid=18940004) Started media manager using bpcd successfully

file read failed  (13)

 

Marianne
Level 6
Partner    VIP    Accredited Certified

We need the bpbkar log on the client that logged pid 7012700.

Please copy the log to bpbkar.txt and upload it as a file attachment.

A_3
Level 4
Certified

I unchecked the NFS option and ran the backup. The /app backup now succeeds, but the /u06 and /u07 backups hang or fail after writing some data. I have also increased the TCP/IP timeout.

I am attaching the bpbkar logs.

Marianne
Level 6
Partner    VIP    Accredited Certified

I have not had a look at bpbkar log yet, but it seems that the problem is not with NFS or with /app specifically, otherwise backup of /app alone in backup selection would not have worked.

How many filesystems are you trying to back up simultaneously on this client? What is max jobs per client set to in the Master server Global properties?

I am thinking that the problem could be with load on the client. Try to limit max jobs for this client to 1 or 2 and see what happens.
You can do this in Host Properties -> Master -> Client Attributes.
Add or select this client and specify max jobs for this client.

A_3
Level 4
Certified

@Marianne:

Thanks for your quick help.

There are 8 filesystems in total, but I have included only /u06 and /u07. I made the change you suggested and started the backup. I will update you on the result.

 

Regards,

Ajay Kumar

A_3
Level 4
Certified

The backup is hung again, after writing 2 GB.

Marianne
Level 6
Partner    VIP    Accredited Certified

What is happening on the client?

Is bpbkar still getting updated?

Is bpbkar process still running?

Can you check resources (memory, cpu) on the client?

A_3
Level 4
Certified

CPU and the other resources are normal.

I attached the bpbkar logs in my previous post.

 

 

INT_RND
Level 6
Employee Accredited
 

You are experiencing intermittent network failures. Take a look at the last few lines of the log you posted:
 

00:56:22.532 [62456050] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE
00:56:22.532 [62456050] <2> bpbkar sighandler: INF - ignoring additional SIGPIPE signals
00:56:22.532 [62456050] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 40: network connection broken

 

Perform a thorough analysis of the network path. You might want to do some performance tuning along the path and increase timeout values.

Official TECHNOTE for ERR 40:

http://www.symantec.com/business/support/index?page=content&id=TECH66916

Marianne
Level 6
Partner    VIP    Accredited Certified

I think that if we look at the bpbrm log we will see a timeout. So the status 40 on the client side is probably because bpbrm terminated the backup after getting no response from the client.

We see a 20-minute gap between these 2 entries:

00:34:19.079 [34472204] <4> bpbkar PrintFile: /u07/exports/
00:56:22.532 [62456050] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE

Logging level looks like it is still at 0.

Nagalla asked 2 days ago to increase logging level to 5. 
We need to know what is happening on the client in these 20 minutes.

Please increase client's logging level:
Add VERBOSE = 5 in /usr/openv/netbackup/bp.conf
before trying backup for these 2 mount points again.

If I was troubleshooting this issue, I would also 'tail -f' the bpbkar log on the client in one window, and in another window run something like 'top' while the backup is running. 

The problem is on the client. We need to know what is happening on the client that makes the backup go into hang state.
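
A gap like the one between the two bpbkar entries quoted above can also be located mechanically in a long log. The following is only a minimal sketch, assuming the line layout seen in the excerpts in this thread (`HH:MM:SS.mmm [pid] <severity> message`). The function name `find_gaps` and the 5-minute threshold are hypothetical choices for illustration and are not part of NetBackup.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpts in this thread:
# HH:MM:SS.mmm [pid] <severity> message
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (.*)$")


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (gap, previous_line, next_line) for consecutive log entries
    whose timestamps are at least `threshold` apart."""
    gaps = []
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not start with a timestamp
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if prev_time is not None:
            delta = stamp - prev_time
            if delta < timedelta(0):
                # The log carries no date, so assume it rolled past midnight.
                delta += timedelta(days=1)
            if delta >= threshold:
                gaps.append((delta, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


# The entries quoted earlier in this thread.
sample = [
    "00:34:19.079 [34472204] <4> bpbkar PrintFile: /u07/exports/",
    "00:56:22.532 [62456050] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE",
    "00:56:22.532 [62456050] <2> bpbkar sighandler: INF - ignoring additional SIGPIPE signals",
]

gaps = find_gaps(sample)
for delta, before, after in gaps:
    print(f"gap of {delta}:\n  {before}\n  {after}")
```

To run it against a real log, pass an open file object instead of `sample`. Note that the two quoted entries carry different PIDs; this sketch compares consecutive lines only and does not group entries by PID.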

RamNagalla
Moderator
Partner    VIP    Certified

Along with the bpbkar log at verbose 5 for those 2 mount points:

Also run the df -k command and see whether it completes normally.

Also run ls -R /u07/exports and see how long the command takes to complete. It should not take more than 20 minutes.
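
If a scripted check is preferred over timing the command by hand, a rough equivalent is sketched below. It is only an approximation of `ls -R`, assuming Python is available on the client: it walks the tree with `os.walk`, counts entries, and gives up after a time limit. The function name `timed_walk` and the 1200-second default (the 20 minutes mentioned above) are hypothetical choices for illustration.

```python
import os
import tempfile
import time


def timed_walk(root, limit_seconds=1200):
    """Walk `root` recursively, roughly like `ls -R`.

    Returns (entries_seen, elapsed_seconds, finished). `finished` is False
    if the walk was abandoned after `limit_seconds`. Directories that cannot
    be read are skipped silently, which is the default behaviour of os.walk.
    """
    start = time.monotonic()
    entries = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        entries += len(dirnames) + len(filenames)
        if time.monotonic() - start > limit_seconds:
            return entries, time.monotonic() - start, False
    return entries, time.monotonic() - start, True


# Small self-contained demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    open(os.path.join(root, "a", "f.txt"), "w").close()
    count, elapsed, finished = timed_walk(root)
    print(f"{count} entries in {elapsed:.3f}s, finished={finished}")
```

On the client, the call would be `timed_walk("/u07/exports")`. The time limit is only checked between directories, so a single directory listing that blocks (for example on a stale mount) would still hang the script, which is itself a useful symptom.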