04-03-2014 09:36 AM
I have a new 18.104.22.168 OpsCenter running on RHEL, based in the USA, which will be used as a global OC as part of a restructure of the backup team in the UK/USA/AP. I have added two 22.214.171.124 Master Servers and two 126.96.36.199 Master Servers, and I am having an issue with one of the 188.8.131.52 servers: it fails to complete data collection for the BMR, Skipped Files and Job Throughput data types. I get the following error message: Nbsl/Agent communication exception]: org.omg.CORBA.COMM_FAILURE: vmcid: 0x0 minor code: 0 completed: Maybe
At present the Master Server is also connected to a local UK 75.0.7 OC, which shows no errors at all. Could this be due to the NBSL process being hammered, since it is serving two OpsCenter servers and the master also uses SLP duplication?
05-09-2014 07:07 AM
I also have one server for which a few of the data types are not collected, so this does seem to be an issue. I faced it in version 7.5 and it seems to persist in 7.6. Given that, I guess it is the OpsCenter server that is not able to collect the data. Please check the memory utilization on your OpsCenter server and check whether you see any heap size errors.
05-09-2014 04:04 PM
1. Could you please confirm that bp.conf on the 184.108.40.206 master server (the one with the issue) is updated with the OpsCenter server name?
OPS_CENTER_SERVER_NAME = <server_name>
2. On OpsCenter, in the data collection tab, is the agent checked?
If yes, then please uncheck it.
3. Restart the services on the OpsCenter server.
4. Restart the NetBackup services or the NBSL service on the NetBackup master server.
Let us know the result :)
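For step 1, a minimal sketch of how the bp.conf check could be scripted is shown here. It assumes the usual `KEY = value` layout of bp.conf and the default UNIX location `/usr/openv/netbackup/bp.conf`; the function name `opscenter_servers` and the sample host names are hypothetical, not part of NetBackup.

```python
from pathlib import Path


def opscenter_servers(bp_conf_text):
    """Return the value of every OPS_CENTER_SERVER_NAME entry found in bp.conf text."""
    servers = []
    for line in bp_conf_text.splitlines():
        # Split on the first "=" only; lines without "=" are ignored.
        key, sep, value = line.partition("=")
        if sep and key.strip() == "OPS_CENTER_SERVER_NAME":
            servers.append(value.strip())
    return servers


if __name__ == "__main__":
    # Assumed default path on a UNIX/Linux master server; adjust if yours differs.
    path = Path("/usr/openv/netbackup/bp.conf")
    if path.exists():
        print(opscenter_servers(path.read_text()))
    else:
        print(f"{path} not found")
```

An empty list would mean the master has no OpsCenter server registered in bp.conf; more than one entry would show that several OpsCenter servers are listed.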
05-13-2014 03:27 AM
It seems to be a case where two OpsCenter servers (local UK 220.127.116.11 and a global 18.104.22.168 OpsCenter server) are connected and collecting data from one master server.
This type of configuration (one master connected to and monitored by multiple OpsCenter servers) is not supported. With such a configuration the result is undefined, and some types of data collection may fail.
The exception CORBA::COMM_FAILURE is very generic, but it could be due to a network and/or connectivity issue.
In order to isolate the issue, can we disable data collection from the local OpsCenter server and let data collection continue with the global OpsCenter server?
Please let us know whether it makes any difference to data collection.
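Since COMM_FAILURE can point at plain connectivity problems, a basic TCP reachability test from the OpsCenter server to the master can help rule that out. The sketch below uses only the Python standard library; the helper name `can_connect`, the host name `master01.example.com` and the port 1556 are assumptions for illustration, so substitute the host and port your environment actually uses.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical master server name and an assumed port; replace with your own.
    host, port = "master01.example.com", 1556
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

A successful connection only shows that the port is open; it does not prove that NBSL itself is healthy, so treat it as a first filter rather than a diagnosis.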
05-13-2014 03:32 AM
Can you please let us know which types of data collection are failing? And what error is the OpsCenter server monitor showing?
What is the master server version?
Is the same type of data collection failing after the OpsCenter upgrade, or is it different?
05-14-2014 04:26 AM
Apologies for the delay in responding, I have been off for a while.
I have checked bp.conf on the servers and they are all fine. One thing I did do was bounce the OpsCenter server and restart everything, including re-adding the server, and this has stabilised over the past few weeks.
I appreciate that our current config is not supported, but as part of a migration from a local backup environment to a global one we wanted to ensure that things would work correctly. I was not sure how easy it is to migrate the database from a 22.214.171.124 Wintel OC to a 126.96.36.199 RHEL OC, so I went the lazy way :)
If anyone knows how I can transfer the custom reports from one to the other, that would be great, as I don't want to create them all by hand again.