Master server always "partially connected" - Ops Center/NetBackup 7.1.0.2

Niall_Porter
Level 4

Folks,

We recently upgraded our NetBackup Enterprise master/media server (one box) from 7.0 to 7.1.0.2. Following this, we installed an OpsCenter server for the first time and went straight to 7.1.0.2 for that as well. The master server runs RHEL 4.7 x64 and the OpsCenter server runs RHEL 5.5 x64.

The problem is that OpsCenter always shows the NetBackup server as "Partially Connected". I understand that it's normal to see this occasionally, but it is constantly in this state, and the Data Collection Status for this server shows the following:

Policy & Schedule | Sep 19, 2011 9:12 AM | Sep 19, 2011 9:12 AM | Failed | -
Scheduled Jobs | Not Reported | Sep 19, 2011 9:12 AM | Failed | -

NetBackup itself works just fine, so I guess something is preventing OpsCenter from gathering data for the data types above. As a result, the overview page you see when you log in to OpsCenter shows no data for the reports that are generated from this data.

I've checked connectivity and the PBX port 1556 is open between the two machines.  Anything else I should check for?
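
A minimal sketch of the port check described above, for anyone who wants to script it rather than use telnet. The function name `port_is_open` is illustrative and not part of NetBackup or OpsCenter. A successful TCP connect only shows that the port is reachable; it says nothing about whether the service behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int = 1556, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    1556 is the PBX port mentioned in this thread. This only tests TCP
    reachability, not the state of the service listening on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it in both directions (from the master against the OpsCenter hostname, and from the OpsCenter server against the master), for example `port_is_open("opscenter.domain.com")`, using the exact hostnames that appear in the configuration.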

 

Thanks in advance,
Niall

12 REPLIES

MOHAMED_PATEL
Level 5
Partner Accredited Certified

Just for clarity: is OpsCenter also running 7.1.0.2?

 

For NetBackup 7.1 master servers, add the following line to the scl.conf file to enable 'Breakup Jobs':

Location: /opt/SYMCOpsCenterServer/config

nbu.scl.collector.enableBreakupJobDataCollection=true

 =====

In addition, have you registered OCA with the master server?

root@yourserver # nbregopsc -add opscenter.domain.com
Registering OpsCenter (opscenter.domain.com) with NB
Successfully registered

Registering NB with OpsCenter (opscenter.domain.com)


This will add a new entry to bp.conf:
OPS_CENTER_SERVER_NAME = opscenter.domain.com
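
Both settings above can be verified with a small script. This is only a sketch: `parse_conf` and `has_setting` are hypothetical helper names, and it assumes both files consist of simple `KEY = value` lines with `#` starting a comment line, which I have not confirmed against the product documentation.

```python
from collections import defaultdict


def parse_conf(text: str) -> dict:
    """Parse 'KEY = value' lines into {key: [values]}.

    Keys may repeat, so every value is kept in order of appearance.
    Assumption: blank lines and lines starting with '#' can be ignored.
    """
    entries = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            entries[key.strip()].append(value.strip())
    return dict(entries)


def has_setting(text: str, key: str, expected: str) -> bool:
    """Return True if key is present with the expected value (case-insensitive)."""
    values = parse_conf(text).get(key, [])
    return expected.lower() in (v.lower() for v in values)
```

Read the contents of scl.conf and bp.conf into strings and call, for example, `has_setting(scl_text, "nbu.scl.collector.enableBreakupJobDataCollection", "true")` and `has_setting(bp_text, "OPS_CENTER_SERVER_NAME", "opscenter.domain.com")`.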

Niall_Porter
Level 4

Thanks Mohamed,

Yes, Ops Center is also 7.1.0.2, as reported from the "About" link on the web interface:

Version 7.1.0.2 (Build 20110825)

 

I checked the bp.conf file on the master and the line for the "OPS_CENTER_SERVER_NAME" was already there and contained the hostname of the Ops Center server.  I successfully pinged the Ops Center server hostname and got a telnet connection to it on port 1556 using the exact hostname in the bp.conf file.

I checked the scl.conf file on the OpsCenter server and it did not have the Breakup Jobs line, so I added it and restarted all the OpsCenter processes. However, the problem remains the same: the server still shows as "Partially Connected" and the same data types listed in my original post are marked as "Failed"...

MartinVi
Level 6

Hi Niall and Mohamed,

We have the same problem in our NetBackup environment and also have no solution for it. I also tried changing scl.conf, but without success. We have three master servers configured in OpsCenter 7.1.0.2, and for two of them it works without problems. I don't know where the problem is. I hope someone can help us with this.

Regards,

Martin

bartman10
Level 4

Has anyone figured this out? I was hoping to find a solution without having to open a ticket...

Symboy
Level 6
Accredited Certified

If a specific data collection is failing, the OpsCenter server logs (OID 148) are the best place to start.

 

In many cases I have seen it fail while collecting information about virtual machines that are still listed in the NBU Admin Console but have been deleted from the environment.

Please check that all the machines listed are valid, using the steps below.

 

1) NBU Console -->> Credentials -->> Virtual Machine Servers .

2) nbemmcmd -listhosts

 

This is just one reason out of many why the data collector might fail. If it does not help, go through the 148 (OC Server) logs from OpsCenter and the NBSL (NetBackup Service Layer, OID 132) logs from the master server, both at the highest verbosity.

 

Hope that helps 

D_Flood
Level 6

I've been following this thread since my Windows based masters have been reporting the same thing.

 

I finally found this article which appears to have cleared it up.  The partially connected master was only partially configured for vmware....

 

http://www.symantec.com/business/support/index?page=content&id=TECH173926&actp=search&viewlocale=en_US&searchid=1322251132630

Symboy
Level 6
Accredited Certified

This is what I pointed out in my previous comment. If you had looked at the 148 logs, they would have shown that data collection for virtual machines was failing.

D_Flood
Level 6

Well, it didn't fix it completely but at least it sometimes is fully connected.  But it is still disconnected more than connected.

And I wasn't able to find the suggested log file(s) with any of the supplied information.  May I suggest that the article be updated with the path and name of the log file(s)?

tom_sprouse
Level 6
Employee Accredited Certified

The OpsCenter server logs are located under: ./installpath/OpsCenter/Server/logs/

The 148 logs will look like this example: 58330-148-1108492810-111130-0000001030.log
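
To pick the 148 logs out of a busy logs directory, something like the sketch below may help. It assumes, based on the example name above, that file names are five dash-separated numeric fields ending in `.log` and that the second field is the originator ID (OID). The helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed layout, inferred from the example file name in this thread:
# <field1>-<originator id>-<field3>-<6-digit date>-<sequence>.log
LOG_NAME = re.compile(r"^(\d+)-(\d+)-(\d+)-(\d{6})-(\d+)\.log$")


def originator_id(filename: str):
    """Return the originator ID from a log file name, or None if it does not match."""
    match = LOG_NAME.match(filename)
    return int(match.group(2)) if match else None


def logs_for_oid(log_dir: str, oid: int = 148):
    """Return the log files in log_dir whose originator ID equals oid, sorted by name."""
    return sorted(
        path
        for path in Path(log_dir).iterdir()
        if path.is_file() and originator_id(path.name) == oid
    )
```

Pass the logs directory from your own installation to `logs_for_oid`; the same function with `oid=132` would select the NBSL logs on the master server, assuming they follow the same naming scheme.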

 

daza
Level 4
Partner Accredited

Same here. We have a NetBackup 7.0.1 master on Solaris 10 x64 (a virtual machine) and have just installed OpsCenter 7.1 on Windows 2008 R2 x64, also a VM. We did all the steps above, but still have issues with data collection and the status is always "Partially Connected". Here is the 148 log:

12/2/2011 10:55:28.515 [Info] DataCollectionStatusManager updateMasterServerStatus dataCollectorId : 1 status : PARTIALLY_CONNECTED
12/2/2011 10:55:28.515 [Error] ScheduleJobCollector master01 initialSync failed  - com.symantec.nbu.nom.scl.common.agent.exception.AgentNotConnectedException: Agent is in DISCONNECTED state
    at com.symantec.nbu.nom.scl.common.agent.AgentContext.getAgent(AgentContext.java:264)
    at com.symantec.nbu.nom.scl.ncl.agent.AgentManager.getPolicyAgent(AgentManager.java:651)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.ScheduleJobCollector.getPolicyAgent(ScheduleJobCollector.java:93)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.ScheduleJobCollector.sync(ScheduleJobCollector.java:57)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.pull(PullCollectorContext.java:133)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.initialSync(PullCollectorContext.java:103)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.submitInitialSyncTask(PullCollectorContext.java:275)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.start(PullCollectorContext.java:79)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.PolicyCollectorManager.start(PolicyCollectorManager.java:47)
    at com.symantec.nbu.nom.scl.common.agent.AgentContext.connect(AgentContext.java:126)
    at com.symantec.nbu.nom.scl.common.agent.ConnectionTask.run(ConnectionTask.java:39)
    at com.symantec.nbu.common.threadpool.ThreadDecorator.run(ThreadDecorator.java:42)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
12/2/2011 10:55:28.515 [Error] PullCollectorContext<ScheduleJobCollector> master01 pull failed  - com.symantec.nbu.nom.scl.common.agent.exception.AgentNotConnectedException: Agent is in DISCONNECTED state
    at com.symantec.nbu.nom.scl.common.agent.AgentContext.getAgent(AgentContext.java:264)
    at com.symantec.nbu.nom.scl.ncl.agent.AgentManager.getPolicyAgent(AgentManager.java:651)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.ScheduleJobCollector.getPolicyAgent(ScheduleJobCollector.java:93)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.ScheduleJobCollector.sync(ScheduleJobCollector.java:57)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.pull(PullCollectorContext.java:133)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.initialSync(PullCollectorContext.java:103)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.submitInitialSyncTask(PullCollectorContext.java:275)
    at com.symantec.nbu.nom.scl.common.collector.PullCollectorContext.start(PullCollectorContext.java:79)
    at com.symantec.nbu.nom.scl.ncl.collector.policy.PolicyCollectorManager.start(PolicyCollectorManager.java:47)
    at com.symantec.nbu.nom.scl.common.agent.AgentContext.connect(AgentContext.java:126)
    at com.symantec.nbu.nom.scl.common.agent.ConnectionTask.run(ConnectionTask.java:39)
    at com.symantec.nbu.common.threadpool.ThreadDecorator.run(ThreadDecorator.java:42)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
12/2/2011 10:55:28.515 [Info] PullCollectorContext<ScheduleJobCollector> master01 state change : INITIAL_SYNC -> STOPPED
12/2/2011 10:55:28.515 [Info] PullCollectorContext<ScheduleJobCollector> master01 isBlackOutPeriod exiting. return : false
12/2/2011 10:55:28.546 [Info] DataCollectionStatusManager updateMasterServerStatus dataCollectorId : 1 status : PARTIALLY_CONNECTED
12/2/2011 10:55:28.562 [Info] PullCollectorContext<ScheduleJobCollector> master01 pull exiting
12/2/2011 10:55:28.562 [Info] PullCollectorContext<ScheduleJobCollector> master01 initialSync exiting
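
For long 148 logs, a rough way to see which exceptions dominate is to count the `[Error]` entries by exception class. The sketch below assumes every entry starts with a timestamp and a bracketed level, as in the excerpt above, and treats any other line as a continuation of the previous entry (stack frames or console-wrapped text). The function names are hypothetical.

```python
import re
from collections import Counter

# Start of an entry, e.g. "12/2/2011 10:55:28.515 [Error] ..."
ENTRY_START = re.compile(r"^\d{1,2}/\d{1,2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3} \[(\w+)\] ")
# Fully qualified exception class followed by ": message"
EXCEPTION = re.compile(r"([\w.$]+Exception): ")


def split_entries(text: str) -> list:
    """Group lines into entries; lines without a timestamp extend the previous entry."""
    entries = []
    for line in text.splitlines():
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # Join without a separator so words split by console wrapping are restored.
            entries[-1] += line
    return entries


def summarize_errors(text: str) -> Counter:
    """Count [Error] entries by the short name of the exception they mention."""
    counts = Counter()
    for entry in split_entries(text):
        match = ENTRY_START.match(entry)
        if match and match.group(1) == "Error":
            exc = EXCEPTION.search(entry)
            name = exc.group(1).rsplit(".", 1)[-1] if exc else "unknown"
            counts[name] += 1
    return counts
```

On the excerpt above this would report `AgentNotConnectedException` for both error entries, which points at the agent connection to the master rather than at a specific data type.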

eMdzeJ
Level 2

We have the same problem. OpsCenter and NetBackup are both version 7.1.0.3. It is definitely a problem with the VMware Access Host; without it, the server connects normally. I also discovered that the problem occurs only if one of the servers (media or master) is selected as the VMware Access Host. When I choose a normal NetBackup client as this host, the server connects normally. Maybe this fact will help to solve the problem.

D_Flood
Level 6

I figured out what was still causing the disconnect in my case even after deleting all VMware hosts.  The one that disconnects is running 2K3 and the NetBackup Proxy Service keeps shutting down.  If I restart it then OpsCenter (running on 2k8) changes state to fully connected.

 

Now to figure out what's killing the Proxy Service...

 

I do plan on upgrading to 7.1.0.3 but not until the new year (the 2k3 box has the ghost job issue also).

<edit>

I see that 7.1.0.3 is now listed as having the etrack fix for the Proxy Service dying so I guess that's another reason to upgrade.

 

http://www.symantec.com/business/support/index?page=content&id=TECH157163&actp=search&viewlocale=en_US&searchid=1324596462657