10-11-2016 11:39 PM
Media Manager daemons not running even after bouncing NBU services.
# ./vmoprcmd -d
oprd returned abnormal status (96)
IPC Error: Daemon may not be running
LTID debug log
02:28:41.603 [7063] <16> InitLtidDeviceInfo: Drive XXXX is ACTIVE
02:29:11.603 [7063] <4> CheckShutdownWhileInit:
I am unable to release the allocations for this drive. I also tried the reset media server option, but it did not work.
10-12-2016 12:10 AM
Forget about releasing MDS allocation for now...
What is wrong with NBU daemons/processes?
Processes perhaps down because of low disk space?
Please show us output of 'bpps -x'
10-12-2016 12:25 AM
/opt/openv has 34 % free space which is 13 GB
bpps -x output :
NB Processes
------------
root 24332 1 0 02:47:58 ? 0:00 /usr/openv/netbackup/bin/bpcompatd
root 24288 1 0 02:47:54 ? 0:00 /usr/openv/pdde/pdag/bin/mtstrmd
root 24399 1 0 02:48:02 ? 0:01 /usr/openv/netbackup/bin/nbrmms
root 24117 1 0 02:47:42 ? 0:01 /usr/openv/netbackup/bin/nbdisco
root 24106 1 0 02:47:40 ? 0:00 /usr/openv/netbackup/bin/bpcd -standalone
p1wms2d2 18659 24106 0 03:22:21 ? 0:00 /usr/openv/netbackup/bin/bpcd -standalone
root 24450 1 0 02:48:09 ? 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 24415 1 0 02:48:05 ? 0:05 /usr/openv/netbackup/bin/nbsl
root 24103 1 0 02:47:40 ? 0:00 /usr/openv/netbackup/bin/vnetd -standalone
MM Processes
------------
root 24314 1 0 02:47:57 ? 0:00 /usr/openv/volmgr/bin/ltid
root 24320 1 0 02:47:57 ? 0:00 vmd -v
Shared Symantec Processes
-------------------------
root 24085 1 0 02:47:34 ? 0:02 /opt/VRTSpbx/bin/pbx_exchange
10-12-2016 12:38 AM
Is this a master or media server?
NONE of the daemons/processes that should be running on a master are running here.
If this is a media server, then there is a problem with comms to nbemm on the master server.
Let us first know what we are troubleshooting here - master or media server?
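A quick way to see which daemons are absent from a `bpps -x` listing is sketched below. The helper name `missing_procs` is made up for illustration, and the list of expected processes is an assumption you pass in yourself (on a media server with tape drives, `avrd` would normally appear next to `ltid` and `vmd` once ltid has finished initializing).

```shell
#!/bin/sh
# Hypothetical helper: reads a process listing on stdin and reports every
# expected process name (given as arguments) that does not appear in it.
missing_procs() {
    listing=$(cat)
    for p in "$@"; do
        # -w matches the name as a whole word, so "vmd -v" and ".../bin/ltid" both count
        printf '%s\n' "$listing" | grep -qw -- "$p" || echo "not running: $p"
    done
}

# Sample input copied from the MM Processes section posted above.
missing_procs ltid vmd avrd <<'EOF'
root 24314 1 0 02:47:57 ? 0:00 /usr/openv/volmgr/bin/ltid
root 24320 1 0 02:47:54 ? 0:00 vmd -v
EOF
```

On a live server the same function could be fed directly, for example `bpps -x | missing_procs ltid vmd avrd`.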
10-12-2016 12:47 AM
10-12-2016 01:07 AM
Looks like some issue with EMM comms at the moment.
Or with robot control host.
Can you run these commands on the master and look for output specific to this media server?
nbemmcmd -listhosts -verbose
nbemmcmd -getemmserver
10-12-2016 02:41 AM
Command completed successfully.
nbemmcmd -listhosts -verbose
shows active disk for the problem media server
nbemmcmd -getemmserver
Media server shows correct emm server name
We have ACS and TLD drives configured on the media server. Connectivity from the media server to the robotic control host and the ACSLS server is good.
10-12-2016 02:48 AM
By the looks of it, you have a VERBOSE entry in vm.conf (vmd -v).
Please check /var/adm/messages (assuming this is a Solaris media server) for Media Manager startup.
Issues with process startup and/or comms with ACSLS server and/or robot control host should be logged here.
10-12-2016 11:41 AM
If this is Solaris, as root at the prompt:
run tpconfig
hit Enter
select 2 for robot config. It will display the robots. Make sure you only have the correct robots assigned.
Hit Q to exit each screen back to the prompt. Only display the configuration and make no edits, because edits here will cause issues!
I have seen my Solaris media servers allocate multiple ACSLS servers and then fail to find the correct drives until the wrong ones were removed. Please verify your robots.
10-12-2016 05:32 PM
Have you checked this technote:
http://www.veritas.com/docs/000038499
Try enabling these logs under /usr/openv/volmgr/debug:
tpcommand
robots
ltid
daemon
reqlib
With VERBOSE in vm.conf, you should be able to get some error messages out of them.
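Legacy debug logs are only written when the matching directory exists, so the five folders above have to be created first. A minimal sketch, assuming the default install path; the function name `make_debug_dirs` and the `DEBUG_ROOT` variable are invented here so the loop can be tried against a scratch directory before touching the real one.

```shell
#!/bin/sh
# Assumed default location of the Media Manager legacy debug logs.
DEBUG_ROOT="${DEBUG_ROOT:-/usr/openv/volmgr/debug}"

# Hypothetical helper: create one log directory per name listed in the post.
make_debug_dirs() {
    root="$1"
    for d in tpcommand robots ltid daemon reqlib; do
        mkdir -p "$root/$d" || return 1
        echo "created $root/$d"
    done
}

# Example, as root on the media server:
#   make_debug_dirs "$DEBUG_ROOT"
```

The daemons will probably need to be restarted before they begin writing to newly created directories.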
10-12-2016 06:38 PM
Hi Marianne & others,
Thanks for your reply.
Logs from /var/adm/messages
1 ) cat messages |grep vmd
Oct 12 20:59:19 vmd[24320]: [ID 631293 daemon.notice] terminating - successful (0)
Oct 12 20:59:19 vmd[24320]: [ID 715111 daemon.error] volume daemon terminating because it received a signal (15)
Oct 12 20:59:19 vmd[24320]: [ID 164182 daemon.error] terminating - daemon terminated (7)
Oct 12 21:01:13 vmd[2236]: [ID 617826 daemon.notice] ready for connections
2) LTID debug logs
21:01:12.034 [2230] <4> ltid: INITIATING
21:01:12.053 [2230] <2> mm_getnodename: cached_hostname, cached_method 3
21:01:12.080 [2230] <2> mm_getnodename: (3) hostname <MEDIA SERVER NAME>(from mm_master_config.mm_server_name)
21:01:12.080 [2230] <4> ltid: EMM tuning parameters: emm_retries = 1, emm_connect_timeout = 20, emm_request_timeout = 300
21:01:12.080 [2230] <4> ltid: using Scan Host status interval 300
21:01:12.080 [2230] <4> ltid: auto_correct_paths set to 0
21:01:14.090 [2230] <4> InitLtidEMM: emmserver_name = <EMM SERVER NAME>
21:01:14.090 [2230] <4> InitLtidEMM: emmserver_port = 1556
21:01:14.090 [2230] <4> InitLtidEMM: ThisHost = <MEDIA SERVER NAME>
21:01:14.105 [2230] <4> RedirectNBLoggerToLegacyLog: Setting up redirection of VxUL messages to legacy log
21:01:14.109 [2230] <4> RedirectNBLoggerToLegacyLog: Verbosity from VxUL configuration: 1
21:01:14.109 [2230] <4> RedirectNBLoggerToLegacyLog: logmsg verbosity set: 1
21:01:14.118 [2230] <2> Orb::init: Enabling ORBNativeCharCodeSet UTF-8(Orb.cpp:708)
21:01:14.120 [2230] <2> Orb::init: initializing ORB EMMlib_Orb with: ltid -ORBSvcConfDirective "-ORBDottedDecimalAddresses 0" -ORBSvcConfDirective "static Resource_Factory '-ORBNativeCharCodeSet UTF-8'" -ORBSvcConfDirective "static PBXIOP_Factory '-enable_keepalive'" -ORBSvcConfDirective "static EndpointSelectorFactory ''" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory IIOP_Factory'" -ORBDefaultInitRef '' -ORBSvcConfDirective "static PBXIOP_Evaluator_Factory '-orb EMMlib_Orb'" -ORBSvcConfDirective "static Resource_Factory '-ORBConnectionCacheMax 1024 '" -ORBSvcConf /dev/null -ORBSvcConfDirective "static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'"(Orb.cpp:923)
21:01:14.132 [2230] <2> Orb::init: caching EndpointSelectorFactory(Orb.cpp:938)
21:01:14.134 [2230] <2> Orb::setOrbConnectTimeout: timeout seconds: 50(Orb.cpp:1570)
21:01:14.134 [2230] <2> Orb::setOrbRequestTimeout: timeout seconds: 1800(Orb.cpp:1579)
21:01:14.158 [2230] <4> InitLtidEMM: connected to EMM server
21:01:14.158 [2230] <2> Orb::setOrbRequestTimeout: timeout seconds: 300(Orb.cpp:1579)
21:01:14.158 [2230] <2> mm_getnodename: (0) hostname <MEDIA SERVER NAME>(from cached_hostname)
21:01:14.159 [2230] <2> retrieveLocalPatchVersion: Reading from /usr/openv/netbackup/version
21:01:14.159 [2230] <2> parsePatchVersionString: parsing = >7.6.0.3
<
21:01:14.159 [2230] <2> parsePatchVersionString: theRest = ><
21:01:14.200 [2230] <4> InitLtidEMM: Device Mappings version in EMM database is 1.120
21:01:14.212 [2230] <4> InitLtidEMM: Local device mapping is up-to-date
21:01:14.394 [2230] <4> InitLtidEMM: Releasing media server allocations
21:01:14.402 [2230] <2> Orb::init: Enabling ORBNativeCharCodeSet UTF-8(Orb.cpp:708)
21:01:14.402 [2230] <2> Orb::init: initializing ORB Default_CLIENT_Orb with: Unknown -ORBSvcConfDirective "-ORBDottedDecimalAddresses 0" -ORBSvcConfDirective "static Resource_Factory '-ORBNativeCharCodeSet UTF-8'" -ORBSvcConfDirective "static PBXIOP_Factory '-enable_keepalive'" -ORBSvcConfDirective "static EndpointSelectorFactory ''" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory IIOP_Factory'" -ORBDefaultInitRef '' -ORBSvcConfDirective "static PBXIOP_Evaluator_Factory '-orb Default_CLIENT_Orb'" -ORBSvcConfDirective "static Resource_Factory '-ORBConnectionCacheMax 1024 '" -ORBSvcConf /dev/null -ORBSvcConfDirective "static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'"(Orb.cpp:923)
21:01:14.404 [2230] <2> Orb::init: caching EndpointSelectorFactory(Orb.cpp:938)
21:01:17.791 [2230] <16> InitLtidDeviceInfo: Drive DLT7_53_DP_19 is ACTIVE
21:01:17.792 [2230] <16> InitLtidDeviceInfo: Drives are still assigned on this host
21:01:47.792 [2230] <4> CheckShutdownWhileInit:
21:01:48.041 [2230] <16> InitLtidDeviceInfo: Drive DLT7_53_DP_19 is ACTIVE
21:02:18.042 [2230] <4> CheckShutdownWhileInit:
21:02:18.042 [2230] <4> CheckShutdownWhileInit: Pid=1, Data.Pid=2582, Type=100, Param1=0, Param2=-1, LongParam=2106721584
21:02:18.290 [2230] <16> InitLtidDeviceInfo: Drive DLT7_53_DP_19 is ACTIVE
3) nbrbutil -resetMediaServer <mediaserver> and -releaseMDS did not release the allocation for DLT7_53_DP_19
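To pull out just the drives that ltid keeps reporting as ACTIVE, the log can be filtered as sketched here. The function name `active_drives` is made up for illustration, and the pattern assumes the message format shown in the log lines above.

```shell
#!/bin/sh
# Hypothetical helper: reads ltid debug log text on stdin and prints each
# drive name from "Drive <name> is ACTIVE" messages once.
active_drives() {
    sed -n 's/.*InitLtidDeviceInfo: Drive \(.*\) is ACTIVE.*/\1/p' | sort -u
}

# Sample input copied from the ltid log above.
active_drives <<'EOF'
21:01:17.791 [2230] <16> InitLtidDeviceInfo: Drive DLT7_53_DP_19 is ACTIVE
21:01:17.792 [2230] <16> InitLtidDeviceInfo: Drives are still assigned on this host
21:01:48.041 [2230] <16> InitLtidDeviceInfo: Drive DLT7_53_DP_19 is ACTIVE
EOF
```

For the sample lines this prints the single drive name DLT7_53_DP_19, which is the drive the allocation has to be released for.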
10-12-2016 07:45 PM
10-12-2016 08:52 PM
/var/adm/messages
Oct 12 20:02:59 inetd[5532]: [ID 317013 daemon.notice] bgssd[25087] from 150.234.116.184 53306
Oct 12 20:59:19 vmd[24320]: [ID 631293 daemon.notice] terminating - successful (0)
Oct 12 20:59:19 vmd[24320]: [ID 715111 daemon.error] volume daemon terminating because it received a signal (15)
Oct 12 20:59:19 vmd[24320]: [ID 164182 daemon.error] terminating - daemon terminated (7)
Oct 12 21:01:13 vmd[2236]: [ID 617826 daemon.notice] ready for connections
Oct 12 21:01:17 ltid[2230]: [ID 533143 daemon.error] ltid can not be started while resources are assigned to the host.
Oct 12 21:03:01 inetd[5532]: [ID 317013 daemon.notice] bgssd[4375] from 150.234.116.184 39884
Oct 12 22:03:03 inetd[5532]: [ID 317013 daemon.notice] bgssd[13928] from 150.234.116.184 57575
Oct 12 23:03:05 inetd[5532]: [ID 317013 daemon.notice] bgssd[24184] from 150.234.116.184 39725
Oct 12 23:03:46 inetd[5532]: [ID 317013 daemon.notice] bgssd[24404] from 150.234.116.184 44176
Oct 12 23:03:46 inetd[5532]: [ID 317013 daemon.notice] bgssd[24406] from 150.234.116.184 44206
Oct 12 23:03:46 inetd[5532]: [ID 317013 daemon.notice] bgssd[24411] from 150.234.116.184 44251
10-12-2016 09:11 PM
10-12-2016 09:52 PM
MdsAllocation: allocationKey=55846077 jobType=1 mediaKey=4035269 mediaId=W91587 driveKey=2001252 driveName=DLT7_53_DP_19 drivePath=/dev/rmt/7cbn stuName=DP19_VTL masterServerName=wspebrms01-ebr mediaServerName=<media server name> ndmpTapeServerName= diskVolumeKey=0 mountKey=0 linkKey=0 fatPipeKey=0 scsiResType=1 serverStateFlags=1
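The allocation key and drive name can be picked out of an MdsAllocation line like the one above with a small filter. This is only a sketch: `mds_field` is a made-up helper name, and the thread does not say which command produced the line (it looks like `nbrbutil -dump` output, but that is an assumption). The script only prints a candidate release command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: $1 is a field name; stdin holds MdsAllocation lines
# made of space-separated name=value pairs. Prints the value of that field.
mds_field() {
    tr ' ' '\n' | sed -n "s/^$1=//p"
}

# Sample line shortened from the allocation posted above.
line='MdsAllocation: allocationKey=55846077 jobType=1 mediaKey=4035269 mediaId=W91587 driveKey=2001252 driveName=DLT7_53_DP_19 drivePath=/dev/rmt/7cbn'

key=$(printf '%s\n' "$line" | mds_field allocationKey)
drive=$(printf '%s\n' "$line" | mds_field driveName)

echo "drive $drive is held by allocation $key"
echo "candidate command: nbrbutil -releaseMDS $key"
```

Values that contain spaces would be cut short by this split, so it is only suitable for simple fields such as allocationKey, mediaId and driveName.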
10-12-2016 10:29 PM
10-12-2016 10:56 PM
I have cleared all lock files and started NBU. Sorry, forgot to mention in the previous reply.
This is the only MDS allocation for this media server. I tried releasing it, but that did not work.
I haven't run nbrbutil -resetall. Please advise whether it will affect other jobs running on the master.
10-12-2016 11:14 PM
10-13-2016 12:13 AM
10-13-2016 11:23 AM - edited 10-13-2016 11:27 AM
nbrbutil -releaseMDS 55846077
This did not help? What does it say?
I see that tape W91587 is in the drive. Have you managed to get this tape out of the drive? Use robtest or eject the tape from the library console, then see whether you can releaseMDS with the nbrbutil command again.