Media server bounces back to Active for Disk in SSO
Hi All,
We have a newly set up media server which is showing Active for Disk only instead of Active for Disk & Tape. The SSO licence is active on the media server.
While configuring, the drive path is shown and the drives are visible from the command "vmoprcmd -d" for a few seconds, and then the drives disappear, as shown below:
D:\Veritas\Volmgr\bin>vmoprcmd -d
PENDING REQUESTS
<NONE>
DRIVE STATUS
Drv Type Control User Label RecMID ExtMID Ready Wr.Enbl. ReqId
ADDITIONAL DRIVE STATUS
Drv DriveName Shared Assigned Comment
D:\Veritas\Volmgr\bin>
We tried to force it from the command line, but with no success:
nbemmcmd -updatehost -machinename client_name -machinestateop set_tape_active -machinetype media -masterserver <Master server>
NBU Version: 7.5.0.6
OS : win2008 R2
I am attaching the LTID debug log.
There are connectivity errors.
Media servers can go offline for tape (active for disk only) for many reasons including the following:
- Master failed to receive the heartbeats.
- Drive polling fails.
If ltid is exiting it's because it can't detect any devices on the system.
In this case ltid starts up, but detects a change in machine state:
17:36:15.121 [904.1212] <4> avrd: INITIATING
17:36:15.121 [6968.4420] <4> SendEmmHeartbeat: Detected change in MachineState...
17:36:15.121 [6968.4420] <4> SendEmmHeartbeat: ...clearing LOCAL_CONTROL bits
So apparently the heartbeats failed.
Every 5 minutes the media server attempts to send the heartbeat again. Every time after 17:36 it detects a change in machine state, clears the LOCAL_CONTROL bits and tries to bring the libraries up.
At this point all updates fail:
18:06:15.442 [6968.4420] <16> emmlib_SetRobotStatus: (0) UpdateLibrary failed, emmError = 2007031, nbError = 0
18:06:15.442 [6968.4420] <16> ROBOT_CONNECT: (-) Translating EMM_ERROR_SQLNoDataFound(2007031) to 328 in the device management context
18:06:15.442 [6968.4420] <16> ROBOT_CONNECT: emmlib_SetRobotStatus() failed for robot 3 with error 328
18:06:15.442 [6968.4420] <4> SendEmmHeartbeat: Going Active failed - reset machine state.
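If you want to see how often this happens across the whole ltid log, a small script can pull out the failed heartbeats and the EMM error codes. This is only a rough sketch: it assumes every log line has the same "time [pid.tid] <severity> function: message" layout as the excerpts above, and the function names parse_ltid_line, failed_heartbeats and emm_errors are made up for illustration.

```python
import re

# Assumed layout, taken from the excerpts above:
#   HH:MM:SS.mmm [pid.tid] <severity> function: message
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\.(\d+)\] <(\d+)> ([^:]+): (.*)$"
)


def parse_ltid_line(line):
    """Split one debug log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    ts, pid, tid, sev, func, msg = m.groups()
    return {
        "time": ts,
        "pid": int(pid),
        "tid": int(tid),
        "severity": int(sev),
        "function": func,
        "message": msg,
    }


def failed_heartbeats(lines):
    """Return the timestamps at which SendEmmHeartbeat reported 'Going Active failed'."""
    times = []
    for line in lines:
        rec = parse_ltid_line(line)
        if (
            rec
            and rec["function"] == "SendEmmHeartbeat"
            and "Going Active failed" in rec["message"]
        ):
            times.append(rec["time"])
    return times


def emm_errors(lines):
    """Return the sorted set of emmError codes seen in the log."""
    codes = set()
    for line in lines:
        rec = parse_ltid_line(line)
        if rec:
            codes.update(int(c) for c in re.findall(r"emmError = (\d+)", rec["message"]))
    return sorted(codes)
```

Feed it the lines of the ltid debug log (for example open(path).readlines()) and compare the failure times with your network or DNS events.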
I would suggest you check the following:
WINS or DNS server issues
A NIC going bad.
You might want to schedule a bptestnetconn every 10 minutes or so to see when the media server is not available.
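One way to do the scheduling, if you would rather not set up a Windows scheduled task, is a small loop that runs the test and appends the output to a file. Treat this as a sketch under assumptions: the install path below is a guess based on the D:\Veritas path shown earlier, the "-a -s" options (all tests against all configured servers) should be checked against the bptestnetconn usage on your version, and probe_once / probe_forever are hypothetical names.

```python
import subprocess
import time
from datetime import datetime

# Assumed location; adjust to your install path.
BPTESTNETCONN = r"D:\Veritas\NetBackup\bin\bptestnetconn.exe"


def probe_once(run=subprocess.run, cmd=(BPTESTNETCONN, "-a", "-s")):
    """Run the connectivity test once and return a timestamped text record."""
    started = datetime.now().isoformat(timespec="seconds")
    try:
        result = run(list(cmd), capture_output=True, text=True, timeout=300)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Binary missing, not executable, or the test hung.
        return f"{started} probe failed: {exc}\n"
    return f"{started} rc={result.returncode}\n{result.stdout}{result.stderr}"


def probe_forever(logfile, interval_seconds=600):
    """Append one record to logfile every interval_seconds (600 = 10 minutes)."""
    while True:
        with open(logfile, "a", encoding="utf-8") as fh:
            fh.write(probe_once())
        time.sleep(interval_seconds)
```

Start it with probe_forever(r"D:\temp\bptestnetconn.log") on the media server and look for gaps or non-zero return codes around the times ltid reports the failed heartbeat.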
I would suggest enabling the following logging levels on the media server:
vxlogcfg -a -p 51216 -o 111 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -a -p 51216 -o 156 -s DebugLevel=6 -s DiagnosticLevel=6
Make sure that the daemon directory is also available under volmgr\debug
On the master - if space allows - run the following:
vxlogcfg -a -p 51216 -o 111 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -a -p 51216 -o 156 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -a -p 51216 -o 137 -s DebugLevel=6 -s DiagnosticLevel=6
Check the resulting log files for issues connecting.
An example is in http://www.symantec.com/docs/TECH69625
*** Don't forget to remove logging when done.
Media server:
vxlogcfg -r -p 51216 -o 111 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -r -p 51216 -o 156 -s DebugLevel=6 -s DiagnosticLevel=6
Master:
vxlogcfg -r -p 51216 -o 111 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -r -p 51216 -o 156 -s DebugLevel=6 -s DiagnosticLevel=6
vxlogcfg -r -p 51216 -o 137 -s DebugLevel=6 -s DiagnosticLevel=6
Good luck
After that happens this entity can no longer make any updates to the database and