09-11-2012 07:48 PM
We have one site with 7 FT media servers. In recent days many jobs have failed with status code 83 on one of those media servers. The job details look like this:
09/12/2012 02:00:00 - Info nbjm (pid=5246) starting backup job (jobid=24220) for client cbp52a, policy cbp52a.memDB, schedule Full
09/12/2012 02:09:41 - Info bpbrm (pid=27190) telling media manager to start backup on client
09/12/2012 02:09:41 - Info bptm (pid=27192) using 262144 data buffer size
09/12/2012 02:09:41 - Info bptm (pid=27192) using 16 data buffers
09/12/2012 02:09:42 - Error bptm (pid=27192) Could not open FT Server pipe: pipe open failed (17)
09/12/2012 02:09:42 - Info bptm (pid=27192) EXITING with status 83 <----------
09/12/2012 02:09:42 - Info bpbrm (pid=27190) got ERROR 83 from media manager
09/12/2012 02:10:28 - Info nbjm (pid=5246) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=24220, request id:{5910287C-FC44-11E1-80DF-08980C2EE471})
09/12/2012 02:10:28 - requesting resource Any
09/12/2012 02:10:28 - requesting resource nbumaster.NBU_CLIENT.MAXJOBS.cbp52a
09/12/2012 02:10:28 - requesting resource nbumaster.NBU_POLICY.MAXJOBS.cbp52a.memDB
09/12/2012 02:10:31 - granted resource nbumaster.NBU_CLIENT.MAXJOBS.cbp52a
09/12/2012 02:10:31 - granted resource nbumaster.NBU_POLICY.MAXJOBS.cbp52a.memDB
09/12/2012 02:10:31 - granted resource 00A087
09/12/2012 02:10:31 - granted resource Drive001
09/12/2012 02:10:31 - granted resource nbumedia1-hcart3-robot-tld-0
09/12/2012 02:10:31 - granted resource TRANSPORT
09/12/2012 02:10:32 - estimated 14014803 kbytes needed
09/12/2012 02:10:32 - Info nbjm (pid=5246) started backup job for client cbp52a, policy cbp52a.memDB, schedule Full on storage unit nbumedia1-hcart3-robot-tld-0
09/12/2012 02:10:32 - started process bpbrm (pid=27190)
09/12/2012 02:10:34 - end writing
media open error (83)
All 7 media servers run on the same platform. Each media server has 3 HBAs for FT services, and each HBA is connected to 6-12 clients. Each client has 2-3 policies, and up to 15 jobs may run at the same time per media server. During one backup window, 12 jobs were started on that media server; only 3 started successfully and the others failed with status 83. Devices > Media Servers > View FT Connections showed only 3 connections.
Strangely, the error hits random clients: a job on a client can fail with status 83, and an hour later another job on the same client works fine.
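To check whether the failures really are random across clients and times, the FT pipe errors can be pulled out of exported job details. This is only a rough sketch: the pattern is modelled on the single bptm error line quoted above, the month/day/year date order is an assumption, and the function name find_ft_pipe_failures is made up for illustration.

```python
import re
from datetime import datetime

# Pattern modelled on the bptm error line quoted in this thread.
# Other NetBackup versions may word the message differently (assumption).
FT_PIPE_ERROR = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - Error bptm "
    r"\(pid=(?P<pid>\d+)\) Could not open FT Server pipe: (?P<reason>.*)$"
)


def find_ft_pipe_failures(lines):
    """Return a list of (timestamp, bptm pid, reason) for each FT pipe failure."""
    failures = []
    for line in lines:
        match = FT_PIPE_ERROR.match(line.strip())
        if match:
            # Date order month/day/year is assumed from the log in this thread.
            ts = datetime.strptime(match.group("ts"), "%m/%d/%Y %H:%M:%S")
            failures.append((ts, int(match.group("pid")), match.group("reason")))
    return failures


# Three lines copied from the job details above, used as sample input.
SAMPLE = [
    "09/12/2012 02:09:41 - Info bptm (pid=27192) using 16 data buffers",
    "09/12/2012 02:09:42 - Error bptm (pid=27192) Could not open FT Server pipe: pipe open failed (17)",
    "09/12/2012 02:09:42 - Info bptm (pid=27192) EXITING with status 83 <----------",
]

if __name__ == "__main__":
    for ts, pid, reason in find_ft_pipe_failures(SAMPLE):
        print(ts.isoformat(), pid, reason)
```

Running it over the job details of every failed job in one backup window should show whether the failures cluster at the moment many jobs start together.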
Some configurations are as follows:
NetBackup Version: 7.1
OS (both media server and client): SUSE Linux Enterprise Server 10 (x86_64)
Storage Unit: VTL with 8 drives, maximum streams per drive is set to 4
Maximum concurrent FT connections 12
NUMBER_DATA_BUFFERS 16
SIZE_DATA_BUFFERS 262144
# /usr/openv/netbackup/bin/admincmd/nbftconfig -getconfig
u nbumaster 2 4 preferred 5 5
09-12-2012 02:00 AM
09-12-2012 02:10 AM
09-12-2012 04:47 AM
Did you select "Limit jobs per policy" in the policy attributes?
If so, please deselect it or change the number to 15.
09-12-2012 06:13 AM
09-12-2012 06:42 AM
Try NUMBER_DATA_BUFFERS_FT with a value of 32.
The case that prompted TECH71062 was solved with a value of 32.
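For anyone trying this: as far as I know, NUMBER_DATA_BUFFERS_FT is a touch file holding a single number, like NUMBER_DATA_BUFFERS. Below is a minimal sketch of creating it, assuming the usual touch-file directory /usr/openv/netbackup/db/config on the FT media server. Please verify the path and the file's exact meaning against TECH71062 before relying on it. The helper name write_ft_buffer_count is made up for illustration.

```python
from pathlib import Path

# Assumed location of NetBackup touch files on a UNIX/Linux media server.
DEFAULT_CONFIG_DIR = Path("/usr/openv/netbackup/db/config")


def write_ft_buffer_count(count, config_dir=DEFAULT_CONFIG_DIR):
    """Write the buffer count into a NUMBER_DATA_BUFFERS_FT touch file.

    Returns the path of the file that was written.
    """
    if not isinstance(count, int) or count <= 0:
        raise ValueError("buffer count must be a positive integer")
    path = Path(config_dir) / "NUMBER_DATA_BUFFERS_FT"
    # Touch files of this kind hold just the number on a single line.
    path.write_text(f"{count}\n")
    return path
```

On the media server this would be called as write_ft_buffer_count(32), run as a user allowed to write to that directory. The new value should only apply to jobs started after the file is written.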
09-12-2012 04:47 PM
09-12-2012 10:46 PM
Maybe time to log a support call?