05-25-2014 03:18 AM
Hello everyone,
I have some database backups failing with code 6, while the filesystem backups for the same clients are fine!
EXIT STATUS = 6: the backup failed to back up the requested files
Any idea about my issue?
Thanks
05-25-2014 04:31 AM
Status 6 is a generic error saying that the backup has failed.
You will agree that the error code itself does not give us anything to work with.
Please start by telling us more about the database backup.
Which type of database?
DB type and version on client?
OS on client?
OS on master?
OS on media server?
(Unix is way too generic...)
Please show us all text in Details tab of the failed job as well as db output file on the client.
If filesystem backup is working, it means that comms between media server and client are fine.
Db backup needs comms between client and master.
Port connectivity on port 1556 is needed in both directions as well as forward and reverse name lookup between master and client.
Have you checked that?
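As a rough way to verify the two checks above, here is a minimal Python sketch, not a NetBackup tool. It assumes Python is available on the host you test from; the function names are made up for illustration. It does a forward lookup followed by a reverse lookup and tries to open a TCP connection to port 1556.

```python
import socket


def lookup_roundtrip(hostname):
    """Resolve hostname to an IP, then resolve that IP back to a name."""
    ip = socket.gethostbyname(hostname)          # forward lookup
    reverse_name = socket.gethostbyaddr(ip)[0]   # reverse lookup
    return ip, reverse_name


def names_match(hostname, reverse_name):
    """Compare short host names case-insensitively (the domain part is ignored)."""
    return hostname.split(".")[0].lower() == reverse_name.split(".")[0].lower()


def port_open(host, port=1556, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run it in both directions: on the client call lookup_roundtrip() and port_open() with the master's host name, and on the master do the same with the client's host name. If the name that comes back from the reverse lookup differs from the one you started with, that is worth chasing before anything else.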
05-25-2014 05:04 AM
client AIX 6.1
master AIX 6.1
media redhat
I checked comms between master/media and client; all fine both ways.
5/23/2014 7:17:32 PM - Info nbjm(pid=21233842) starting backup job (jobid=8467298) for client autom, policy SAP_POICY_1, schedule INCR
5/23/2014 7:17:32 PM - Info nbjm(pid=21233842) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=8467298, request id:{BEAE2F9C-E295-11E3-BAA6-7407D4050000})
5/23/2014 7:17:32 PM - requesting resource STU1
5/23/2014 7:17:32 PM - requesting resource nbumaster.NBU_CLIENT.MAXJOBS.autom
5/23/2014 7:17:32 PM - requesting resource nbumaster.NBU_POLICY.MAXJOBS.SAP_POICY_1
5/23/2014 7:17:34 PM - granted resource nbumaster.NBU_CLIENT.MAXJOBS.autom
5/23/2014 7:17:34 PM - granted resource nbumaster.NBU_POLICY.MAXJOBS.SAP_POICY_1
5/23/2014 7:17:34 PM - granted resource 000296
5/23/2014 7:17:34 PM - granted resource drive_12
5/23/2014 7:17:34 PM - granted resource STU1
5/23/2014 7:17:41 PM - Info bptm(pid=7955) using 65536 data buffer size
5/23/2014 7:17:41 PM - Info bpbrm(pid=7942) autom is the host to backup data from
5/23/2014 7:17:41 PM - Info bpbrm(pid=7942) telling media manager to start backup on client
5/23/2014 7:17:41 PM - estimated 0 Kbytes needed
5/23/2014 7:17:41 PM - Info nbjm(pid=21233842) started backup (backupid=autom_1400861861) job for client autom, policy SAP_POICY_1, schedule INCR on storage unit STU1
5/23/2014 7:17:42 PM - Info bptm(pid=7955) using 12 data buffers
5/23/2014 7:17:42 PM - Info bpbrm(pid=7942) sending bpsched msg: CONNECTING TO CLIENT FOR autom_1400861861
5/23/2014 7:17:42 PM - Info bpbrm(pid=7942) spawning a brm child process
5/23/2014 7:17:42 PM - Info bpbrm(pid=7942) listening for client connection
5/23/2014 7:17:42 PM - Info bpbrm(pid=7942) child pid: 12532
5/23/2014 7:17:42 PM - connecting
5/23/2014 7:17:43 PM - Info bpbrm(pid=7942) INF - Client read timeout = 1200
5/23/2014 7:17:43 PM - Info bpbrm(pid=7942) accepted connection from client
5/23/2014 7:17:43 PM - connected; connect time: 0:00:01
5/23/2014 7:32:45 PM - Error bpbrm(pid=12532) client autom EXIT STATUS = 6: the backup failed to back up the requested files
5/23/2014 7:32:45 PM - Info dbclient(pid=0) done. status: 6
5/23/2014 7:32:45 PM - Info bpbrm(pid=7942) sending message to media manager: STOP BACKUP autom_1400861861
5/23/2014 7:32:46 PM - Info bpbrm(pid=7942) media manager for backup id autom_1400861861 exited with status 150: termination requested by administrator
5/23/2014 7:32:46 PM - end writing
the backup failed to back up the requested files(6)
05-25-2014 05:39 AM
If we look at job details, it seems to be a SAP backup, but with dbclient failing, it seems to be Oracle policy type?
Please confirm policy type and which kind of script on the client - rman or backint script?
Did you do database to NBU linking on the client as per appropriate manual (SAP or Oracle)?
All comms between master, media and client seem to be fine, but the script on the client is failing.
We need to see output of the backup script on the client.
Please create dbclient folder on the client under /usr/openv/netbackup/logs to troubleshoot from NBU point of view.
Remember to 'chmod 777 dbclient'.
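If you prefer to script the folder creation, the sketch below does the same thing as mkdir plus 'chmod 777'. It is only an illustration: the helper name make_log_dir is made up, and it assumes Python is available on the client and that you run it as a user allowed to write under the logs directory.

```python
import os


def make_log_dir(name, base="/usr/openv/netbackup/logs"):
    """Create base/name if it is missing and make it world-writable (mode 777)."""
    path = os.path.join(base, name)
    os.makedirs(path, exist_ok=True)
    # Explicit chmod, because the mode used by makedirs is reduced by the umask.
    os.chmod(path, 0o777)
    return path
```

For this thread that would be make_log_dir("dbclient") on the client.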
05-25-2014 11:45 AM
I agree with Marianne. It would be useful to see the dbclient log.
The status 6 is due to a time out. If you look, you can see that there is a 15 minute gap in the messages you copied in:
5/23/2014 7:17:42 PM - Info bpbrm(pid=7942) child pid: 12532
5/23/2014 7:17:42 PM - connecting
5/23/2014 7:17:43 PM - Info bpbrm(pid=7942) INF - Client read timeout = 1200
5/23/2014 7:17:43 PM - Info bpbrm(pid=7942) accepted connection from client
5/23/2014 7:17:43 PM - connected; connect time: 0:00:01
[gap]
timeout from the client.
5/23/2014 7:32:45 PM - Error bpbrm(pid=12532) client autom EXIT STATUS = 6: the backup failed to back up the requested files
5/23/2014 7:32:45 PM - Info dbclient(pid=0) done. status: 6
5/23/2014 7:32:45 PM - Info bpbrm(pid=7942) sending message to media manager: STOP BACKUP autom_1400861861
5/23/2014 7:32:46 PM - Info bpbrm(pid=7942) media manager for backup id autom_1400861861 exited with status 150: termination requested by administrator
5/23/2014 7:32:46 PM - end writing
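For longer job details it can be easier to let a script find the gap. This is a minimal sketch that assumes every line of interest starts with a timestamp in the same 'M/D/YYYY H:MM:SS AM/PM' form as the lines pasted above; the function names are made up for illustration.

```python
from datetime import datetime


def parse_timestamp(line):
    """Parse the leading 'M/D/YYYY H:MM:SS AM|PM' stamp of a job-details line."""
    parts = line.split()
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(" ".join(parts[:3]), "%m/%d/%Y %I:%M:%S %p")
    except ValueError:
        return None  # not a timestamped line, e.g. "[gap]"


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence, or None."""
    stamped = [(parse_timestamp(line), line) for line in lines]
    stamped = [(ts, line) for ts, line in stamped if ts is not None]
    best = None
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, l0, l1)
    return best
```

Fed the lines quoted above, it should point at the silence between the "connected; connect time" line at 7:17:43 PM and the status 6 error at 7:32:45 PM.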
I would actually suggest enabling the bptm, bpbrm and bpcd logs too (on the media server bptm/bpbrm, client bpcd along with dbclient).
Additionally, I would not discount communication issues. It's not uncommon for reverse and forward lookup to return different names/IPs. Unfortunately, I have heard a number of times that everything is 'ok' with communication, only to find out otherwise when detailed logs are reviewed.
Let us know what you find. If you don't understand this information, it might not be a bad idea to open a case.
05-26-2014 01:34 AM
In addition, create the bphdb log on the client; its contents can often help to identify the problem.
As always, it is useful to run the database backup script from a prompt to see if there are any messages not caught by the logs.
05-26-2014 11:38 PM
Some streams are successful and others are failing with code 6 within the same job. This indicates the comms are fine.
Does it indicate a configuration or timeout issue?
The issue is happening on both RMAN and backint backups.
05-27-2014 12:28 AM
How many streams do you have coming in at a time (how many channels, in Oracle terms)? Are you seeing a pattern in the failures (for example, during heavy load on the master/media server)? Have you tried reducing the number of channels in the RMAN script?
The logs that were requested by the other forum members will be very helpful for digging further on the cause of the failure(s).
05-27-2014 01:49 AM
The default Client Connect Timeout and Client Read Timeout are both 5 minutes (300 seconds).
This may be fine for smaller databases, but for larger backups it is not sufficient.
We normally change both of these timeouts to 1800 on the media server(s).
The logs will confirm. Especially bpbrm on the media server.
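To see what a media server is currently set to, something like the sketch below could help. It assumes the timeouts are stored as CLIENT_CONNECT_TIMEOUT and CLIENT_READ_TIMEOUT entries in the media server's bp.conf (typically /usr/openv/netbackup/bp.conf on Unix), and that an entry that is missing means the 300-second default applies. The function name is made up for illustration.

```python
DEFAULT_TIMEOUT = 300  # seconds; the 5-minute default mentioned above


def read_timeouts(bp_conf_text):
    """Return the client connect/read timeouts found in bp.conf-style text."""
    timeouts = {
        "CLIENT_CONNECT_TIMEOUT": DEFAULT_TIMEOUT,
        "CLIENT_READ_TIMEOUT": DEFAULT_TIMEOUT,
    }
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in timeouts:
            try:
                timeouts[key] = int(value.strip())
            except ValueError:
                pass  # leave the default in place if the value is not a number
    return timeouts
```

Pass it the contents of bp.conf, for example read_timeouts(open("/usr/openv/netbackup/bp.conf").read()), and compare the result with the "Client read timeout" value that bpbrm prints in the job details.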