AIR Replication not working

lt13624
Level 4

Hi All,

I have an NBU setup that uses AIR from site A to site C and from site B to site C. These are all 5230 appliances. Site A to C works fine (5 appliances). Site B to C (2 appliances out of 5) seems to have stopped working since site C was rebooted. This happened after a FW update to the RAID controller. The trouble is, all 5 appliances at site C had the same FW update. I can't seem to figure it out. All other local backups work fine.

DNS name resolution works fine. I checked the relevant FW ports and they all work fine, so MSDP to MSDP on spad/10102 and spoold/10082 works OK. I tested this using telnet from site B to C. SLP names are identical on both sides. The jobs seem to start and look as if they're progressing, but then just hang for ages, with no error messages as yet. I have seen various posts regarding this but no clear solution. Other appliances replicate from site B to site C fine, so the network link is OK.

Does anyone know why this would happen, or has anyone seen this before? I'm thinking there may be some form of .lock file, or previous replications that died in a messy way. I have tried using nbstlutil cancel to clear jobs, then deleted the SLPs and the remote target, rebooted both source and target, and reconfigured everything, to no avail.

I'm now stumped.
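For anyone wanting to repeat the port test without telnet, here is a minimal sketch in Python. The port numbers are the ones quoted above; the function names and any host name you pass in are placeholders, not part of NetBackup. Bear in mind that a successful connect only exchanges a few small packets, so it shows the ports are reachable but says nothing about how larger transfers behave on the same path.

```python
import socket

# Port numbers as quoted in the post; the service labels are just dictionary keys.
MSDP_PORTS = {"spad": 10102, "spoold": 10082}


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_msdp_ports(host, ports=MSDP_PORTS):
    """Map each service label to True/False depending on reachability."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling `check_msdp_ports("target-appliance.example.com")` from a source appliance (the host name is hypothetical) returns something like `{"spad": True, "spoold": True}` when both ports answer.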

7 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

What was the firmware update you applied? I hadn't noticed one for the 5230s.

Which version are you running (2.5.?)?

How many networks are on each appliance (backup and replication)?

lt13624
Level 4

Hi Mark,

I'm running 2.6.0.1.

The replication appliances have only one network and site A works fine to those; it's just site B. The source appliances (site A and B) have multiple networks, but the routing table is the same and connectivity and DNS seem to work fine.

I was advised by Symantec to update the RAID FW to version 3.240.25-2382 and all upgrades worked fine.

Mark_Solutions
Level 6
Partner Accredited Certified

Do any replications work at all, or does nothing replicate in that one direction?

lt13624
Level 4

Hi Mark,

I have 15 appliances in total: 5 in site A, 5 in site B and 5 in site C, with 1-1 mapping. A to C all work fine. 3 of B to C work fine but 2 do not. None of the SLP replication jobs on the affected 2 work. Nothing has changed, apart from the FW. I just wondered if an AIR job was hanging around when they got rebooted and there is some ".lock" file stopping AIR. It's odd, as the job starts but hangs forever at the "Replicating images to target storage server...." line.

Mark_Solutions
Level 6
Partner Accredited Certified

Worth having a look at this too: http://www.symantec.com/docs/TECH204574

Bear in mind that the lifecycle parameters are in the console now, and that agent.cfg is available on the appliance too.

Mark_Solutions
Level 6
Partner Accredited Certified

General logs to work through for troubleshooting:

Master: admin, nbstserv, bpdbm

Media (appliances): bpdm, bptm, nbrmms and the spad log

lt13624
Level 4

Hi Mark,

It's fixed!

Two things look to have been the cause.

At our site we have the MTU set to 1300. These two appliances were set to 1500. Lesson learnt: always keep looking at network issues :)
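A quick way to spot this kind of mismatch is to compare the configured MTU of each interface against the site standard. The sketch below assumes shell access to a Linux host, where the MTU is exposed under /sys/class/net; the function names, interface names and appliance labels are made up for illustration, and 1300 is simply the value used at this site.

```python
from pathlib import Path


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Read the configured MTU of a Linux network interface from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mtu_mismatches(mtus, expected):
    """Return {name: mtu} for every entry whose MTU differs from expected."""
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


# Hypothetical values collected from several appliances.
observed = {"appliance-1": 1300, "appliance-2": 1500, "appliance-3": 1300}
print(mtu_mismatches(observed, expected=1300))
```

On a single host, `read_mtu("eth0")` (interface name assumed) gives the value to feed into the comparison.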

The other odd thing is that both these appliances had a missing replication_host entry when running nbemmcmd -listhosts.

Running nbemmcmd -addhost -machinename ******** -machinetype replication_host fixed this, but it is odd that it was missing on both appliances.
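If you have many appliances to check, the same test can be scripted against saved output of nbemmcmd -listhosts. This is only a sketch: it assumes each host line has two whitespace-separated fields, machine type then machine name, which you should confirm against your own output. The function name and the sample host names are invented for illustration.

```python
def hosts_of_type(listhosts_output, machine_type="replication_host"):
    """Return machine names listed with the given machine type.

    Assumes lines of the form '<machinetype> <machinename>'; any other
    line (banner, status message, blank) is ignored.
    """
    hosts = []
    for line in listhosts_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == machine_type:
            hosts.append(fields[1])
    return hosts


# Illustrative text only, not captured from a real system.
sample = """\
The following hosts were found:
master            master01
media             appliance-b1
replication_host  appliance-c1
"""

if not hosts_of_type(sample):
    print("no replication_host entry found")
```

An empty list from `hosts_of_type(...)` would correspond to the missing entry described above.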

It was not until both of these issues were fixed that it sprang to life :)

Thanks for your help as always