AIR Replication not working
Hi All,
I have an NBU setup that uses AIR from site A to site C and from site B to site C. These are all 5230 appliances. Site A to C works fine (5 appliances). Site B to C (2 appliances out of 5) seems to have stopped working since site C was rebooted. This happened after a firmware update to the RAID controller. The trouble is, all 5 appliances at site C had the same firmware update. I can't seem to figure it out. All other local backups work fine.
DNS name resolution works fine. I checked the relevant firewall ports and they are all open, so MSDP to MSDP on spad/10102 and spoold/10082 works OK. I tested this by using telnet from site B to site C. SLP names are identical on both sides. The jobs start and look as if they're progressing, but then just hang, with no error messages so far. They stay in that state for ages. I have seen various posts regarding this but no clear solution. Other appliances replicate from site B to site C fine, so the network link is OK.
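For anyone wanting to repeat the telnet check across several appliances, here is a minimal Python sketch of the same test. The port numbers are the ones quoted above; the function names and the hostname in the usage comment are made up for illustration. Note that this only proves a TCP connection can be opened, the same as telnet does. It says nothing about whether large packets get through.

```python
import socket

# Ports as quoted in the post: spad on 10102, spoold on 10082.
MSDP_PORTS = {"spad": 10102, "spoold": 10082}


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_msdp_ports(host, ports=MSDP_PORTS, timeout=5.0):
    """Return a dict mapping each service name to True/False for reachability."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}


# Example (hypothetical hostname, run from the source appliance):
#   print(check_msdp_ports("target-msdp.example.com"))
```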
Does anyone know why this would happen, or has anyone seen this before? I'm thinking there may be some form of .lock file, or previous replications that died in a messy way. I have tried using nbstlutil cancel to clear the jobs, then deleted the SLPs and the remote target, rebooted both source and target, and reconfigured everything, all to no avail.
I'm now stumped.
Hi Mark,
It's fixed!
Two things look to have been the cause.
At our site we have the MTU set to 1300. These two appliances were set to 1500. Lesson learnt, always keep looking at network issues :)
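A likely reason the telnet test passed while the jobs hung is that a TCP handshake only uses small packets, whereas replication traffic sends full-size ones that a 1300-byte path will not carry. A rough way to confirm a mismatch like this is a ping with the "don't fragment" bit set and a payload sized to the MTU. The sketch below works out that payload size and builds the command. It assumes IPv4 without header options and the Linux iputils `ping` (`-M do`, `-s`, `-c`); the function names and hostname are hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with the don't-fragment bit set."""
    return ["ping", "-M", "do", "-s", str(max_ping_payload(mtu)),
            "-c", str(count), host]


# With the values from this thread:
#   max_ping_payload(1300) is 1272, max_ping_payload(1500) is 1472
# Example (hypothetical hostname):
#   print(" ".join(df_ping_command("target-msdp.example.com", 1300)))
```

If the 1272-byte ping gets replies and the 1472-byte one does not, the path will not carry 1500-byte packets and the appliance MTU needs lowering to match.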
The other odd thing is that both of these appliances had a missing replication_host entry when running nbemmcmd -listhosts.
nbemmcmd -addhost -machinename ******** -machinetype replication_host fixed this, but it is odd that it was missing on both appliances.
It was not until both of these issues were fixed that it sprang to life :)
Thanks for your help as always