Status 13

praveenu143
Level 3
Hi,

Some of the full backup jobs are failing with status 13, but the differential backups are completing successfully. Can someone please help me?
18 Replies

Andy_Welburn
Level 6
In-depth Troubleshooting Guide for Exit Status Code 13 in NetBackup Server (tm) / NetBackup Enterpri...

and then browse through the search results for Status Code 13 here:

This Technote contains a table for NetBackup's Status Codes, with links to potential resolutions for...


Which version of NB?
Which O/S affected?
Any more details you can provide? (logs etc etc)

praveenu143
Level 3
NBU version: 6.5.3.1
OS: Solaris 10

error:
05/09/2010 06:04:53 - begin writing
05/09/2010 06:50:52 - Error bpbrm (pid=15691) socket read failed: errno = 131 - Connection reset by peer
05/09/2010 06:59:10 - end writing; write time: 0:54:17
file read failed (13)

nairdheeraj
Level 4
Maybe you can try reducing the VSP cache file size to 5 GB or 10 GB; the default would be 30 GB. If there is not enough space you will encounter status 13.

You can also try disabling VSP if you feel it is not required.

Do let me know if it helped.

praveenu143
Level 3
Hi,

Is VSP available on Solaris?

Andy_Welburn
Level 6
per http://seer.entsupport.symantec.com/docs/327076.htm

Are they just 'normal' filesystem backups, or application/database?
Are there thousands of small files? (Having said that, I would've thought the diffs would've failed rather than the fulls.)

nairdheeraj
Level 4

I didn't know that the client in question was Solaris. Well, in this case I would say you could check the duplex settings on the server and switch side. Ideally they need to be set to 100/1000 Mbps full duplex. I know some environments have a habit of keeping it on Auto, but Symantec does recommend having it hard-coded as full duplex.
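On Solaris 10, `dladm show-dev` reports each link's speed and duplex. A minimal sketch of the check, run here against a sample of that kind of output (the interface names and column layout below are illustrative, not taken from this thread):

```shell
# Sample of the kind of per-link output `dladm show-dev` prints on
# Solaris 10; on a real host, pipe the command itself into grep.
cat > /tmp/dladm.sample <<'EOF'
bge0 link: up speed: 1000 Mbps duplex: full
bge1 link: up speed: 100 Mbps duplex: half
EOF

# Flag any link not running full duplex - these are the candidates
# to hard-set to full duplex on both the host and the switch port.
grep -v 'duplex: full' /tmp/dladm.sample
```

Remember that duplex must match on both ends: hard-coding the host while the switch port stays on auto-negotiate is a classic cause of the mismatch.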

praveenu143
Level 3
OK..Let me check this

praveenu143
Level 3
I've checked the CLIENT_READ_TIMEOUT setting on the media server; the value is 3600. I've also checked performance: the job is running at a good speed of 11603 KB/sec. The full backup job is failing after backing up 3000 files. I have tried twice, and the job fails at the same point.

Please help me with other options..
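For reference, on a Unix media server that setting lives in bp.conf at /usr/openv/netbackup/bp.conf. A minimal sketch of verifying it, run here against a sample fragment (the SERVER name is made up):

```shell
# Sample bp.conf fragment; on a real media server, grep the actual
# file at /usr/openv/netbackup/bp.conf instead.
cat > /tmp/bp.conf.sample <<'EOF'
SERVER = mediaserver1
CLIENT_READ_TIMEOUT = 3600
EOF

# Confirm the timeout entry (3600 seconds = 1 hour)
grep CLIENT_READ_TIMEOUT /tmp/bp.conf.sample
```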

nairdheeraj
Level 4
You can try to find if it is failing while backing up a particular file system or a particular file.

There could be a corrupted file or file system.

praveenu143
Level 3
How to find this ?

nairdheeraj
Level 4
You can either go to the job details page and see the last file system it was trying to back up, or go to the "Problems" report under the "Reports" section and specify the client name so that you get a more detailed view of what the error was.

Maybe you can do one thing: create a new policy for the same client and try to back up only the / (root) file system, and see if the backup completes. You can try this for each of the file selections you have in the policy to find the point of failure.
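To build those per-file-system selections, one approach is to list the locally mounted file systems and use each mount point as a separate line in the policy's backup selections (a sketch; `df -kl` limits the output to local file systems):

```shell
# Print the mount point of every locally mounted file system; each
# of these can become one backup selection line instead of the
# single all_local_drives directive.
df -kl | awk 'NR > 1 { print $NF }'
```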

praveenu143
Level 3
Thanks ! Will try this

praveenu143
Level 3
The backup completed for the root FS. FYI, the file selection in the original policy is all_local_drives.

jim_dalton
Level 6

Turn on logging; that'll give you what you need.
I suspect it'll either be something corrupt in the filesystem that causes the job to hang (fsck it), or there's some unpleasant data structure that takes an age to traverse. I had one with 150,000 directories and no useful data, so I just excluded it from backup.

Another option is dodgy client software that encounters a file with some unusual attributes.

You might also use the usual Unix commands like 'find', since if it happens for the client it's probably going to happen for Unix commands too.

Also consider using bpbkar directly; then there's no need to bother with tapes, and the turnaround is much more rapid.

Jim
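Jim's `find` suggestion can be sketched like this: if the file system contains a corrupt file or a pathological directory tree, a plain traversal will often stall or report an I/O error at the same point the backup does. The path below is an example; substitute the failing file system.

```shell
# Walk the suspect file system the way a backup would, staying on
# one file system (-xdev) and discarding the file list; watch for
# the point where it hangs or reports an error.
find /export/home -xdev -type f -print > /dev/null
```

The bpbkar variant Jim mentions is the same idea done with the NetBackup client binary itself, reading the data and discarding it, so no tape or storage unit is involved.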

nairdheeraj
Level 4
Good to hear that the / FS backup completed without errors. Now let's try specifying each file system in the backup selection instead of using all_local_drives. This way you should be able to find the culprit that is causing the error. Once you have identified the FS, put an exclude on it and try backing up the client with the all_local_drives directive again. I still have a strong feeling that one of your file systems has corruption. Another point: I agree that we can use all_local_drives for Unix clients, but the problem is that backups of certain file systems like /proc will fail. For Unix clients I always prefer to list the specific file systems in the selections rather than using all_local_drives.
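On a Unix client the exclude list is a plain text file of paths, one per line, at /usr/openv/netbackup/exclude_list. A minimal sketch, written to a sample path here; /suspect_fs is a placeholder for whichever file system turns out to be the culprit:

```shell
# Real location on the client: /usr/openv/netbackup/exclude_list
cat > /tmp/exclude_list.sample <<'EOF'
/proc
/suspect_fs
EOF
cat /tmp/exclude_list.sample
```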

praveenu143
Level 3
I've started the backup for each FS with multistreaming enabled.

nairdheeraj
Level 4

Were you able to take a successful backup by specifying the individual FSs as the selection?

nairdheeraj
Level 4

I presume the issue has been resolved by now. Do post your findings after backing up the client with the individual FSs specified.