12-15-2011 01:27 AM
Hi,
I am not able to bring up the media manager daemons after bouncing the NetBackup daemons.
I get this error:
# ./vmoprcmd -d
oprd returned abnormal status (96)
IPC Error: Daemon may not be running
The daemon, reqlib and ltid logs are below:
ltid:
01:21:23.746 [23332] <4> CheckShutdownWhileInit: Pid=1, Data.Pid=25157, Type=100, Param1=0, Param2=-5056, LongParam=-23490544
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-01 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-02 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-03 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-06 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-07 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-08 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-10 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-12 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive IBM_ULTRIUM2_Drv07 is ACTIVE
reqlib:
01:00:01.207 [23580] <2> EndpointSelector::select_endpoint: performing call with the only endpt available!(Endpoint_Selector.cpp:431)
01:00:01.220 [23580] <2> EndpointSelector::select_endpoint: performing call with the only endpt available!(Endpoint_Selector.cpp:431)
01:01:40.430 [23805] <4> vmoprcmd: INITIATING
01:01:40.488 [23805] <2> vmoprcmd: argv[0] = vmoprcmd
01:01:40.488 [23805] <2> vmoprcmd: argv[1] = -d
01:01:40.488 [23805] <2> vmdb_start_oprd: received request to start oprd, nosig = yes
01:01:40.527 [23805] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2046: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
01:01:40.527 [23805] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2060: service: vmd
01:01:40.623 [23805] <2> getrequestack: server response to request: REQUEST ACKNOWLEDGED 650
01:01:40.649 [23805] <2> getrequeststatus: server response: EXIT_STATUS 0
01:01:40.649 [23805] <4> vmdb_start_oprd: vmdb_start_oprd request status: successful (0)
01:02:57.220 [23805] <2> wait_oprd_ready: oprd response: EXIT_STATUS 278
01:02:57.221 [23805] <2> put_string: cannot write data to network: Broken pipe (32)
01:02:57.221 [23805] <16> send_string: unable to send data to socket: Broken pipe (32), stat=-5
daemon:
01:07:42.537 [24144] <4> rdevmi: INITIATING
01:07:42.537 [24144] <2> mm_getnodename: cached_hostname tcppapp001, cached_method 3
01:07:42.583 [24144] <2> mm_ncbp_gethostname: GetNBUName <tcppapp001-bip>
01:07:42.583 [24144] <2> mm_getnodename: (5) hostname tcppapp001-bip (from mm_ncbp_gethostname)
01:07:42.584 [24144] <2> rdevmi: got CONTINUE, connecting to ltid
01:08:02.755 [24098] <16> oprd: device management error: IPC Error: Daemon may not be running
01:08:03.241 [23355] <2> vmd: TCP_NODELAY
01:08:03.242 [23355] <4> peer_hostname: Connection from host tcppvmg265-bip, 172.16.6.7, port 6735
01:08:03.335 [23355] <4> process_request: client requested command=43, version=4
01:08:03.335 [23355] <4> process_request: START_OPRD requested
01:08:03.369 [23355] <4> start_oprd: /usr/openv/volmgr/bin/oprd, pid=24164
01:08:03.638 [23355] <2> vmd: TCP_NODELAY
01:08:03.638 [23355] <4> peer_hostname: Connection from host tcppvmg265-bip, 172.16.6.7, port 18394
01:08:03.685 [23355] <4> process_request: client requested command=43, version=4
01:08:03.685 [23355] <4> process_request: START_OPRD requested
Please help me understand what this issue is and how to resolve it.
12-15-2011 01:42 AM
Master or media server? OS? NBU version?
What processes are currently running? Please post output of 'bpps -x' from cmd on master.
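While collecting that output, a quick check like the following may help spot which media manager daemons are absent. This is only a minimal sketch: the function name `missing_daemons` is hypothetical, and the daemon list (ltid, vmd, avrd) is an assumption that should be adjusted to the daemons expected on your server. It reads a process listing on stdin and prints the names it cannot find.

```shell
#!/bin/sh
# Print the media manager daemons that do NOT appear in a process listing.
# The listing is read from stdin, e.g. the output of `bpps -x` or `ps -ef`.
# Assumption: ltid, vmd and avrd are the daemons of interest on this host.
missing_daemons() {
    listing=$(cat)
    for d in ltid vmd avrd; do
        # Match the daemon name as a whole word: preceded by start of line,
        # a slash or whitespace, and followed by whitespace or end of line.
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[/[:space:]])${d}([[:space:]]|\$)"; then
            printf '%s\n' "$d"
        fi
    done
}
```

For example, `bpps -x | missing_daemons` would list the expected daemons that are not running; empty output means all of them were found.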
12-15-2011 02:39 AM
Hi Nitesh,
I have faced this error quite a few times in my environment, and it mostly happens when the load on the master or the media server is high.
Running the command nbrbutil -resetMediaServer <mediaserver name> on the affected media server fixes the problem for me. If the command doesn't fix the problem, the next step I take is to recycle the NetBackup services on the affected media server. But most of the time nbrbutil does the trick.
You might want to try the nbrbutil command. It resets all nbrb EMM and MDS allocations related to ltid on the media server. Your backups on that media server might be affected, though.
If the command works for you, please look at your backup scheduling and consider load balancing.
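The first step above can be sketched as a small shell helper. Treat this as an illustration under assumptions, not a definitive procedure: the function names `run` and `reset_media_server` are hypothetical, the paths assume a default UNIX install under /usr/openv, and the script defaults to a dry run that only prints the commands. Which host to run nbrbutil on should be confirmed against the NetBackup documentation for your version.

```shell
#!/bin/sh
# Sketch of the reset-then-verify sequence. Dry run by default:
# set DRY_RUN=0 to actually execute the commands.
# Assumption: default install locations under /usr/openv.
NBU_ADMINCMD=${NBU_ADMINCMD:-/usr/openv/netbackup/bin/admincmd}
VOLMGR_BIN=${VOLMGR_BIN:-/usr/openv/volmgr/bin}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

reset_media_server() {
    media_server=$1
    if [ -z "$media_server" ]; then
        echo "usage: reset_media_server <media_server_name>" >&2
        return 2
    fi
    # Step 1: reset the nbrb allocations for this media server.
    run "$NBU_ADMINCMD/nbrbutil" -resetMediaServer "$media_server" || return 1
    # Step 2: check whether the device monitor responds again.
    run "$VOLMGR_BIN/vmoprcmd" -d
}
```

Calling `reset_media_server media_server_name` prints the two commands it would run. If vmoprcmd -d still reports the IPC error after a real run, recycling the NetBackup services on the media server is the fallback described above.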
12-28-2011 10:58 PM
Thanks Kisad,
We were getting the same error and tried the solution Kisad provided. We stopped the NBU processes on the media server and then reset the media server with the nbrbutil command:
nbrbutil -resetMediaServer media_server_name
We then started NetBackup again. This resolved the issue and now everything is working fine.
Please share your experience also.
12-29-2011 10:22 PM
Seems Nitesh never came back to check for replies....
12-30-2011 05:04 AM
Hi champ35,
In my experience, this kind of situation occurs when a large number of backups are running in the environment. In other words, it would be worthwhile to review your backup policy schedules and balance the load. Hope it helps.
Wish you a Happy and Prosperous New Year.
04-04-2012 12:25 AM
Thanks Kisad,
It has helped me a lot.
The nbrbutil -resetMediaServer media_server_name command has resolved the issue.
Regards,
Nitesh Bhat
04-04-2012 12:34 AM
Please mark Kisad's post as solution.
10-13-2022 04:15 PM
THANK YOU! This solved my issue perfectly after everything else failed. You're amazing and a lifesaver even 11 years later.