
Windows Remote Agent / NDMP / Timeout issue for backup.

aea13
Level 2
Hello,

I have a strange issue with a Backup Exec 2010 installation (the issue was present in the previous 12.5 as well).

Have 2 servers: 
HCENT11 - Win2008 64bit Exchange, 2010 Backup Exec, Exchange Agent
HCENT1 - 2003 32bit IIS/WEB, 2010 Remote Agent

The backup on HCENT11 is a single job that backs up the important things on HCENT11 (including the Exchange stores) and also tries to back up important data from HCENT1 and its system state. The HCENT11 items are first on the task list; the HCENT1 items are last.

If I start Backup Exec and force the job to run manually, everything works fine. If I then log off HCENT11 and let the scheduled job run automatically within 24 hours, it also works fine.

BUT, on any run after that 24 hours (a guesstimate), something times out or disconnects, and the job fails when it gets to backing up the HCENT1 server with the remote agent.

To the best of my knowledge, nothing else is using port 10000 on HCENT1. When I stop the remote agent service, I can't telnet to the server on port 10000; if I turn the service on, I can connect and get a few of those random ANSI characters. To me that implies that the service is the only thing using the port. (Am I wrong there?)
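The telnet check described above can also be scripted, which makes it easy to repeat from the backup server at the same hour the scheduled job runs. This is only a minimal sketch: the function name `port_is_open` and the 5-second timeout are illustrative choices, not anything Backup Exec provides.

```python
import socket


def port_is_open(host, port=10000, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves that something accepts connections on the port;
    it does not speak NDMP or verify that the listener is the remote agent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (host name taken from this post):
#   port_is_open("HCENT1.aea13.org", 10000)
```

Running the call from HCENT11 once right after a manual job and again a day later would show whether plain TCP reachability changes over time, independent of Backup Exec.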

Here's a sample of the appropriate data from my SGMon log:
NDMPAgentConnector:SetupConnection(): Data Server = HCENT1.aea13.org
BENGINE:  [03/05/10 02:02:28] [8636]     Could not resolve the "bews-ndmp" or the "ndmp" service, error code: 10109, using port 10000
BENGINE:  [03/05/10 02:02:28] [8636]     ndmpConnectEx: Querying the neighbour advertisement cache to discover information on 'HCENT1.aea13.org' ...
BENETNS:  [03/05/10 02:02:28] [7500]     NRDS API - client connected.
BENETNS:  [03/05/10 02:02:28] [7500]     Connecting to BE Database.
BESERVER: [03/05/10 02:02:28] [0000]     [21224] 03/05/10 02:02:28 -1 Client requested key (1267776295).
BESERVER: [03/05/10 02:02:28] [0000]     [21224] 03/05/10 02:02:28 "Cluster" key does not appear to be present in the registry
BESERVER: [03/05/10 02:02:28] [0000]     [21224] 03/05/10 02:02:28 Failed to open Microsoft cluster ()
BESERVER: [03/05/10 02:02:28] [0000]     [21224] 03/05/10 02:02:28 VCS cluster keys do not appear to be present in the registry
BESERVER: [03/05/10 02:02:28] [0000]     [21224] 03/05/10 02:02:28 Failed to open VCS cluster ()
BESERVER: [03/05/10 02:02:29] [0000]     [21224] 03/05/10 02:02:28 01 Server Configuration: Client added: 8
BESERVER: [03/05/10 02:02:29] [0000]     [21224] 03/05/10 02:02:28 -1 Client 'HCENT11' connected('','AEA13\aea13admin'): 0x5645fbf0
BENETNS:  [03/05/10 02:02:28] [7500]     Successfully connected to BE Database.
BENETNS:  [03/05/10 02:02:28] [7500]     Reading agent database record for HCENT1.aea13.org.
BENETNS:  [03/05/10 02:02:29] [7500]     Found agent record 12 for HCENT1.aea13.org.
BESERVER: [03/05/10 02:02:29] [0000]     [21224] 03/05/10 02:02:29 01 Server Configuration: Client removed: 7
BESERVER: [03/05/10 02:02:29] [0000]     [21224] 03/05/10 02:02:29 -1 Client 'HCENT11' Disconnected:0x5645fbf0
BENETNS:  [03/05/10 02:02:29] [7500]     Disconnected from BE Database.
BENETNS:  [03/05/10 02:02:29] [7500]     NRDS API - client disconnected.
BENGINE:  [03/05/10 02:02:50] [8636]     ndmpEstablishConnectionUsingNoSpecificAdapter: Could not connect to remote address 207.28.54.1 and port 10000 Errorno : 10060
BENGINE:  [03/05/10 02:03:11] [8636]     ndmpEstablishConnectionUsingNoSpecificAdapter: Could not connect to remote address 207.28.54.11 and port 10000 Errorno : 10060
BENGINE:  [03/05/10 02:03:32] [8636]     ndmpEstablishConnectionUsingNoSpecificAdapter: Could not connect to remote address 207.28.54.2 and port 10000 Errorno : 10060
BENGINE:  [03/05/10 02:03:32] [8636]     ndmpConnectEx: unable to connect using NetworkOptions to HCENT1.aea13.org
BENGINE:  [03/05/10 02:03:53] [8636]     ndmpConnectEx: All attempts to connect to 'HCENT1.aea13.org' failed
BENGINE:  [03/05/10 02:03:53] [8636]     ndmpConnectEx: failed to connect to 'HCENT1.aea13.org'
BENGINE:  [03/05/10 02:03:53] [8636]     NDMPAgentConnector::Connect: ndmpConnectEx() failed on server HCENT1.aea13.org, port: 0.
BENGINE:  [03/05/10 02:03:53] [8636]     NDMPAgentConnector::HandleConnectionError: ndmpConnect failed: Could not connect to the Remote Agent on the target machine HCENT1.aea13.org. Probably Remote Agent is not installed on the target system or is not up and running.
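To see at a glance which addresses the engine tried and how each attempt failed, the excerpt above can be filtered with a short script. This is a sketch that assumes the line format shown in this excerpt; the regular expression and the names `FAILED_CONNECT`, `WINSOCK_ERRORS`, and `failed_connections` are made up for illustration. The two code names are the standard Winsock constants for those numbers.

```python
import re

# Winsock codes that appear in the excerpt above.
WINSOCK_ERRORS = {
    10060: "WSAETIMEDOUT (connection attempt timed out)",
    10109: "WSATYPE_NOT_FOUND (here: the service-name lookup found no entry)",
}

# Matches the "Could not connect" lines as they appear in this SGMon excerpt.
FAILED_CONNECT = re.compile(
    r"Could not connect to remote address (\S+) and port (\d+) Errorno : (\d+)"
)


def failed_connections(log_text):
    """Return a list of (address, port, error_code) for failed connect lines."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in FAILED_CONNECT.finditer(log_text)
    ]


def describe(code):
    """Map a Winsock error number to a readable label, if known."""
    return WINSOCK_ERRORS.get(code, "unknown code %d" % code)
```

Applied to the excerpt, this lists three different addresses (207.28.54.1, 207.28.54.11, 207.28.54.2), each failing with 10060, which is a timeout rather than a refused connection.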

And this is the result I get in the email:

The job failed with the following error: The resource could not be backed up because an error occurred while connecting to the Backup Exec for Windows Servers Remote Agent. Make sure that the correct version of the Remote Agent is installed and running on the target computer. If the server or resource no longer exists, remove it from the selection list. Edit the selection list properties, click the View Selection Details tab, and then remove the resource.

I tried setting the dynamic ports (in the Backup Exec network options) to use a range of 10010-10060, but I think those settings don't take effect until the initial connection on port 10000 is made, so they aren't of use here. But the change doesn't seem to affect anything negatively either, since the job still works fine when I run it manually with either the default wide values or the specific ones.

Can anyone clue me in as to why it's failing when I take my eyes off of it?

TIA.

andy
2 REPLIES

STeve_O
Level 5
The 10060 is an operating system error (a Winsock connection timeout). One thing you can try is to edit the C:\windows\system32\drivers\etc\services file and change the NDMP port from the default 10000 to something like 20000. Restart the Backup Exec services on each server. You have to edit the file on each server so both are talking on the same port.

http://seer.entsupport.symantec.com/docs/255174.htm
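To confirm that both servers really ended up with the same port after editing, the services file can be checked with a few lines of Python. This is a sketch under assumptions: it relies on the usual services-file layout (`name port/protocol [aliases] # comment`), and the entry `ndmp 20000/tcp` used below is only an example of that layout; the linked document is the place to confirm the exact entry Backup Exec expects. The function name `find_service_port` is made up for illustration.

```python
def find_service_port(services_text, name, protocol="tcp"):
    """Return the port for a service name or alias in services-file text.

    Returns None if no matching entry exists for the given protocol.
    """
    for raw in services_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/protocol" entry
        port, proto = fields[1].split("/", 1)
        names = [fields[0]] + fields[2:]  # primary name plus aliases
        if name in names and proto.lower() == protocol and port.isdigit():
            return int(port)
    return None


# Example with an assumed entry; on a real server, read the text from
# C:\windows\system32\drivers\etc\services instead.
example = "ndmp    20000/tcp    # example entry only\n"
print(find_service_port(example, "ndmp"))
```

If the function returns a different value on the two servers, or None on one of them, the media server and the remote agent are not using the same port.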

Also, it sounds like you have a firewall or something in between that may be shutting down NDMP or idle NDMP ports, which may explain why it works randomly.


aea13
Level 2
Thanks for the input. I tried adding port 20000 to each of the services files (ndmp didn't exist in either file before) on the two servers and restarting the BE services, but got the same result. When I log in to the backup server, open Backup Exec, and force the job to run, it works. 12 hours later, when it's supposed to run by itself, it errors when getting to the remote agent again.

Windows Firewall is off on both.

Is NDMP a service that should be running? I don't see it listed by name or acronym; maybe I'm missing something?