Library Issue and Media Server

H_Sharma
Level 6

Hi Experts,

We have one master server (NetBackup 7.6) on Windows 2008 and 2 media servers on Solaris in our environment. 8 drives are shared across all three servers.

I don't know whether they are shared or not, but each drive shows 3 paths when we click on it: one for the master and two for the media servers.

1:- Where should we enable the bptm logs for drive issues etc.?

2:- What is the use of sharing these drives with 3 servers? Is an SSO licence used to share the drives here?

3:- A drive path goes down daily, but the library console shows every drive as up. What should be done here? Is it impacting something? Is error code 98 related to it?

4:- What is the role of the 2 media servers here?

I am unable to understand my environment, so please let me know.

Thanks so much,

2 ACCEPTED SOLUTIONS


mph999
Level 6
Employee Accredited


1:- Where should we enable the bptm logs for drive issues etc.?

2:- What is the use of sharing these drives with 3 servers? Is an SSO licence used to share the drives here?

3:- A drive path goes down daily, but the library console shows every drive as up. What should be done here? Is it impacting something? Is error code 98 related to it?

4:- What is the role of the 2 media servers here?


1/
bptm logging is enabled on the media server.

Add BPTM_VERBOSE = 5 to /usr/openv/netbackup/bp.conf on the media server, and create the directory /usr/openv/netbackup/logs/bptm


For status 98, I would also enable:

Add VERBOSE to the /usr/openv/volmgr/vm.conf file.  If this file does not exist, just create it.
If necessary, create the directory /usr/openv/volmgr/debug
 

mkdir /usr/openv/volmgr/debug/robots
mkdir /usr/openv/volmgr/debug/ltid
mkdir /usr/openv/volmgr/debug/reqlib
mkdir /usr/openv/volmgr/debug/tpcommand
 
Creating the following empty files increases the level of detail logged even further:
 
touch /usr/openv/volmgr/DRIVE_DEBUG
touch /usr/openv/volmgr/ROBOT_DEBUG 

 
Restart ltid:
 
/usr/openv/volmgr/bin/stopltid
/usr/openv/volmgr/bin/ltid -v

This needs to be done on both the media server with the tape drive and the robot control host, if that is a separate server.
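The directory and touch-file steps above can be collected into one small function. This is only a sketch: the function name `enable_debug_logging` and its optional root-directory argument are made up for illustration (the argument lets you try it in a scratch directory first), and it assumes the default UNIX install path /usr/openv. It does not restart ltid; do that afterwards with the stopltid and ltid -v commands above.

```shell
#!/bin/sh
# Sketch only: prepare bptm and Media Manager debug logging on a UNIX media server.
# enable_debug_logging and its optional root argument are hypothetical helpers,
# not part of NetBackup. Default root is the standard install path /usr/openv.

enable_debug_logging() {
    root="${1:-/usr/openv}"
    vmconf="$root/volmgr/vm.conf"

    # bptm only writes a log if this directory exists
    mkdir -p "$root/netbackup/logs/bptm"

    # Media Manager debug directories
    for d in robots ltid reqlib tpcommand; do
        mkdir -p "$root/volmgr/debug/$d"
    done

    # Create vm.conf if it is missing, then add VERBOSE only if not already present
    touch "$vmconf"
    grep '^VERBOSE$' "$vmconf" >/dev/null 2>&1 || echo 'VERBOSE' >> "$vmconf"

    # Empty touch files that increase the detail logged
    touch "$root/volmgr/DRIVE_DEBUG" "$root/volmgr/ROBOT_DEBUG"
}
```

Calling `enable_debug_logging` with no argument works against /usr/openv; calling it twice is harmless because the VERBOSE line is only appended when it is missing.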

The Activity Monitor details for a failing job are useful, as is the system messages log.

2/ SSO is used. Drives are shared to reduce cost: one drive can be used by, say, two servers, instead of having 2 drives (one on each).

3/ The library console is separate from NBU. Just because the library console shows the drive as up, that does not tell you whether or not there is a fault on the hardware side. More investigation is required.

4/ No idea, it's your environment... Generally, multiple media servers are used if clients are in different locations, or if the number of clients is too great for one media server on its own.


Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

For the Windows Master server, do the same:

The bptm log folder needs to be created under <install-path>\veritas\netbackup\logs

vm.conf will be in <install-path>\veritas\volmgr
Create a debug folder under volmgr with the same list of folders as on the Solaris media servers.
For the 'touch files', create an empty text file with no extension (take care that Windows Explorer does not add .txt).
Restart the Device Management service after adding the VERBOSE entry and the log folders.

About DOWN drives: there can be many reasons.
One of the most common reasons in an SSO environment is incorrect drive mapping, where the drive position number in the robot is linked to the wrong OS device name. This is normally caused by a lack of Persistent Binding at OS level. For this you need the tools from the HBA vendor.
After you have added the VERBOSE entry on all media servers (followed by a restart), the exact reason for drives going DOWN will be logged in /var/adm/messages on the Solaris media servers and in the Event Viewer Application log on the Windows master server.
It is IMPORTANT that you check which media server is DOWN'ing the drive, and then check the bptm and system logs on that particular media server.

You have multiple media servers to share the backup load. 3 media servers each writing to 1 tape drive at the same time will be faster than 1 master/media server trying to write to all 3 drives at the same time.
Bear in mind that if you have a 1Gb network card in each media server, data can only be received at a maximum of about 100MB/sec. That is roughly the write speed of 1 tape drive.
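To see where the 100MB/sec figure comes from: a 1Gb link carries 1000 megabits per second, and dividing by 8 bits per byte gives a theoretical ceiling of 125MB/sec. Network protocol overhead brings the usable rate down towards the 100MB/sec quoted above. A quick sketch of the arithmetic (the function name `link_mb_per_sec` is made up for illustration):

```shell
#!/bin/sh
# Sketch only: convert a link speed in megabits/sec to a theoretical
# megabytes/sec ceiling using integer arithmetic (8 bits per byte).
# link_mb_per_sec is a hypothetical helper name.

link_mb_per_sec() {
    expr "$1" / 8
}

link_mb_per_sec 1000    # prints 125
```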

SSO means that all media servers can see all the drives and can take turns in using the same tape drives. 

If you had only one drive dedicated to each media server, a drive failure would leave the media server attached to that tape drive with nothing to use until the tape drive is fixed.

