NetBackup jobs are in queue but tape drives still available

T_N · ‎11-07-2006

Hi All,

# vmoprcmd

PENDING REQUESTS

DRIVE STATUS

Drv Type Control User Label RecMID ExtMID Ready Wr.Enbl. ReqId
0 hcart2 ACS root Yes 607735 607735 Yes Yes 2
1 hcart2 ACS root Yes 607395 607395 Yes Yes 1
2 hcart2 ACS - No - -

ADDITIONAL DRIVE STATUS

Drv DriveName Shared Assigned Comment
0 ssl_01311 Yes ssldbrn2-0
1 ssl_0139 Yes ssldbrn2-0
2 ssl_01313 Yes -
#

# netstat -a | grep bpcd
*.bpcd *.* 0 0 49152 0 LISTEN
ssldbrn2-0.bpcd cdcnbu1-20.676 33580 0 49640 0 ESTABLISHED
ssldbrn2-0.uuidgen ssldbrn2-0.bpcd 49152 0 32768 0 ESTABLISHED
ssldbrn2-0.bpcd ssldbrn2-0.uuidgen 32768 0 49152 0 ESTABLISHED
ssldbrn2-0.bpcd cdcnbu1-20.522 33580 0 49108 0 ESTABLISHED
ssldbrn2-0.786 ssldbrn2-0.bpcd 49152 0 32768 0 ESTABLISHED
ssldbrn2-0.bpcd ssldbrn2-0.786 32768 0 49152 0 ESTABLISHED
ssldbrn2-0.bpcd cdcnbu1-20.1001 33580 0 49037 0 ESTABLISHED
ssldbrn2-0.739 ssldbrn2-0.bpcd 49152 0 32768 0 ESTABLISHED
ssldbrn2-0.bpcd ssldbrn2-0.739 32768 0 49152 0 ESTABLISHED
ssldbrn2-0.bpcd cdcnbu1-20.552 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.879 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.772 33580 0 49640 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.1023 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.565 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.563 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.995 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.897 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.828 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.841 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.561 33580 0 49509 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.914 33580 0 49479 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.677 33580 0 49478 0 TIME_WAIT
ssldbrn2-0.bpcd cdcnbu1-20.779 33580 0 49479 0 TIME_WAIT

I have a NetBackup master server (Unix) and 39 media servers (Unix) running Solaris 8, 9, and 10. One of my media servers, ssldbrn2, has 3 tape drives. For some reason, backups go to only 2 of the tape drives while the third stays available. When I start a test backup, the job sits in the queue.

I checked the bpcd port and found several bpcd connections in the TIME_WAIT state. Does anyone know how I can fix this problem? Thanks.
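
To see at a glance how many bpcd connections are in each state, the netstat output can be tallied. This is a minimal sketch, assuming the TCP state is the last field on each line as in the listing above; `count_bpcd_states` is a hypothetical helper name, not a NetBackup command.

```shell
#!/bin/sh
# Tally the TCP states of bpcd connections.
# Reads `netstat -a` style output on stdin and assumes the state
# (LISTEN, ESTABLISHED, TIME_WAIT, ...) is the last field of each line.
count_bpcd_states() {
    grep bpcd | awk '{ n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage: netstat -a | count_bpcd_states
```

A handful of TIME_WAIT entries next to healthy ESTABLISHED ones is normal after connections close, so the counts are mainly useful for spotting states that keep growing between runs.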

Dennis_Strom · ‎11-07-2006

On the storage unit for that library, what do you have the max jobs set to?

zippy · ‎11-07-2006

Tom,

Make sure the drives are up first.

TIME_WAIT is not a bad thing; FIN_WAIT and ACK_WAIT, now those are bad.

http://www.unixguide.net/network/socketfaq/2.7.shtml

From your vmoprcmd output, it appears that tapes are in the drives but the Control field shows ACS; if the robot were working, Control should say TLD.

I would open the library, pop all the tapes out of the drives, and restart NetBackup, then run bpps -a. Make sure everything is running, then kick off the jobs again.
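
The "make sure all is running" step can be scripted. This is a minimal sketch, assuming the daemons to look for are passed as arguments; `missing_daemons` is a hypothetical helper, and which daemons belong on a given host depends on its role (master or media server), so the list in the usage line is only an example.

```shell
#!/bin/sh
# Print the name of each daemon that does NOT appear in `bpps -a` output.
# Reads the bpps listing on stdin; daemon names are given as arguments.
# Matching is a plain substring search, so "tldd" will not match "tldcd".
missing_daemons() {
    listing=`cat`
    for d in "$@"
    do
        printf '%s\n' "$listing" | grep "$d" > /dev/null || echo "$d"
    done
}

# Usage (daemon list is an assumption, adjust per host):
# /usr/openv/netbackup/bin/bpps -a | missing_daemons ltid vmd avrd
```

No output means every named daemon was found in the listing.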

JD

Root@#bpps -a
NB Processes
------------
root 19984 1 0 Oct 27 ? 1:39 /usr/openv/netbackup/bin/bpdbm
root 19950 1 0 Oct 27 ? 9:28 /usr/openv/netbackup/bin/bprd
root 20141 20043 0 Oct 27 ? 0:07 /usr/openv/netbackup/bin/bpjava-susvc root -1 -1 english /usr/o
root 20046 20043 0 Oct 27 ? 0:00 /usr/openv/netbackup/bin/bpjava-susvc root -1 -1 english /usr/o
root 19985 19984 0 Oct 27 ? 3:09 /usr/openv/netbackup/bin/bpjobd
root 20048 20043 0 Oct 27 ? 0:08 /usr/openv/netbackup/bin/bpjava-susvc root -1 -1 english /usr/o
root 20043 1 0 Oct 27 ? 0:00 /usr/openv/netbackup/bin/bpjava-susvc root -1 -1 english /usr/o
root 19749 19727 0 Oct 27 ttyp1 575:53 /veritas/openv/java/jre/bin/PA_RISC2.0/java -Dvrts.NBJAVA_CONF=

MM Processes
------------
root 20000 1 0 Oct 27 ? 0:14 tldcd
root 19945 1 0 Oct 27 ? 1:20 vmd
root 19937 1 0 Oct 27 ? 0:23 /usr/openv/volmgr/bin/ltid
root 19995 19937 0 Oct 27 ? 8:33 avrd
root 19992 19937 0 Oct 27 ? 0:04 tldd

# vmoprcmd

PENDING REQUESTS

DRIVE STATUS

Drv Type Control User Label RecMID ExtMID Ready Wr.Enbl. ReqId
0 hcart TLD - No - -
1 hcart TLD - No - -
2 hcart TLD - No - -
3 hcart TLD - No - -
4 hcart TLD - No - -
5 hcart DOWN-TLD - No - -
6 hcart TLD - No - -
7 hcart TLD - No - -

ADDITIONAL DRIVE STATUS

#Cronjob for Tape up or tape down state.
#check for down tape drives
#* * * * * /usr/local/bin/drive_ck > /dev/null 2>&1

######## Script ##########

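#!/bin/sh
# Created by jim d.
# Checks for down tape drives and brings a downed drive back up.
MASTER=`hostname`
# set -vx    # uncomment to trace execution while debugging
DRVL=/tmp/TD

# Note (and optionally mail) when tpconfig reports a DOWN drive.
if /usr/openv/volmgr/bin/tpconfig -d | grep DOWN
then
    /usr/bin/touch $DRVL
    #echo "!! $MASTER has a downed tapedrive !! `date`" | mail tapedrive
fi

# Save the index (first column) of every DOWN-TLD drive.
/usr/openv/volmgr/bin/vmoprcmd | grep DOWN-TLD | awk '{print $1}' > $DRVL

# Bring each listed drive back up. A plain redirect replaces the
# ksh-only "read -u5", and the loop body is no longer empty (an empty
# do/done is a syntax error), so this also runs under /bin/sh.
while read DRIVELIST
do
    /usr/openv/volmgr/bin/vmoprcmd -up $DRIVELIST
done < $DRVL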
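
The drive-index extraction in the script above can be tried on its own against saved vmoprcmd output before anything is brought up. This is a minimal sketch, assuming the Control value is the third whitespace-separated field as in the listing earlier in this post; `down_drives` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the index of every drive whose Control column starts with DOWN.
# Reads vmoprcmd-style drive status lines on stdin. Testing field 3
# avoids matching the word DOWN anywhere else on the line.
down_drives() {
    awk '$3 ~ /^DOWN/ { print $1 }'
}

# Usage: /usr/openv/volmgr/bin/vmoprcmd | down_drives
```

Fed the drive status shown above, where only drive 5 is DOWN-TLD, this prints 5.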

T_N · ‎11-07-2006

On the storage unit, I set MPX to 12 and max concurrent drives to 3.

I count 21 backup jobs going through this media server (ssldbrn2): 17 are writing and 4 are connecting.
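
A rough capacity check may explain why only two drives are busy. This is a minimal sketch, assuming NetBackup fills each drive up to the storage unit's MPX value before it opens another drive; the schedule's own multiplexing setting, which is not given in this thread, can lower that limit. `drives_needed` is a hypothetical helper.

```shell
#!/bin/sh
# Estimate how many drives are needed when each drive accepts up to MPX
# multiplexed streams: ceiling(JOBS / MPX) using integer arithmetic.
# Usage: drives_needed JOBS MPX
drives_needed() {
    expr \( "$1" + "$2" - 1 \) / "$2"
}

drives_needed 21 12   # 17 writing + 4 connecting jobs, MPX 12; prints 2
```

With 21 jobs and MPX 12, two drives can absorb every stream, which would match the two busy drives in the vmoprcmd output. It does not by itself explain a queued job, so other job limits are still worth checking.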

T_N · ‎11-07-2006

Hi Jame,

My drives are up. I created a script that runs on the master server and checks every 15 minutes; if any drive is down, it brings it up automatically and sends me an email. I suspect it may be a bpcd port problem. How can I refresh the bpcd port on Solaris 8 and 9? I already know how to refresh it on Solaris 10. Thanks.
