Forum Discussion

sjaavid
Level 2
6 years ago

AIR and DR testing

Hi all

We use AIR between two sites so that we can restore from the other domain if something bad happens. I now want to test this and work out a written plan. I want to test restores of SQL, Oracle, Exchange and VMware. I have tried to find documentation that covers this, with no luck.

Yesterday we tried to test a restore of Exchange. The Exchange DAG cluster in the production environment is so-called IP-less, which means the cluster name is not configured with an IP address, although all the nodes are.

What do I need to configure on the DR Master, and on the DAG cluster to make this work?

 

Our environment consists of master servers running 8.1.2 on Linux, several MSDP Linux servers, 5330 appliances and Access servers.

  • Welcome to the complex world of genuine DR recovery testing.  No vendor manual will ever give a customer a fully broken-down, step-by-step restore procedure for "DR testing".  Trust me, the NetBackup manuals really do contain everything you need for normal restore/recovery. But once you change this to a "test DR" recovery, as opposed to a "real DR" recovery (god forbid), which would actually be easier than a "test DR" recovery, you are talking about three wholly different pathways in one breath.  *IF* you have only one network, then the three recovery scenarios of 1) real PROD recovery, 2) real DR recovery and 3) test DR recovery differ from each other enough to warrant separate sets of site-specific procedures, but only because you would be trying to perform all three within the same network space.

    I assume that both sites are fully network routable to each other. If so, your prime question is going to be how to spin up a "restored and recovered" DAG without impacting in any way whatsoever the service offered by the live DAG.  And I think you already know the answer to this: either...

    1) shut down the live DAG and CAS servers, attempt recovery on alternate hardware for the DR DAG and CAS servers, and simply *hope* (fingers crossed) that Active Directory, DNS etc. do not mind a DAG moving MAC addresses, hope that live emails don't start being received in this temporary DAG (which will be deleted after testing), and hope that Active Directory isn't maintaining some kind of metadata about the DAG, in which case Active Directory might get very upset when a DAG appears to travel backwards in time (twice!!)...

    ...or...

    2) perform your test DR recovery in an isolated air-gapped network.

    I know which I would choose.  Option 2).

    With option 2) you get time to practice, again and again and again.  With option 2) you have no risk at all to live services.  With option 2) you get to test each database platform one at a time, or all at the same time.  And best of all, because the test DR network is fully air-gapped, it can look exactly like live, so you get a win-win-win: your "test DR" procedures become exactly like your "real DR" recovery procedures, which in turn are exactly like your "real PROD" recovery procedures, all because your air-gapped "test DR" network is to all intents and purposes exactly like your production network (just a bit smaller).

    The same "prod versus test" exposure question applies to all the other database-related services (SQL, Oracle), i.e. how to "test DR" restore and recover apparently live production database services without them actually being truly live.  The same is sometimes true even for whole VM recovery, but whole VM recovery testing can be a little easier: restore the whole VM (but do not boot it yet), manually disable or remove virtual networking from the VM, boot the VM, log on at the VM console and check the VM. You then just have to trust that all will be OK in a true DR situation, when the VM is recovered with real, live, active networking.
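
    The whole-VM test order described above (restore without booting, remove networking, boot, log on, check) can be written down as a small runbook helper that refuses out-of-order steps. This is only a sketch for documenting a procedure: the class, the step names and the VM name are hypothetical, and nothing here calls NetBackup or a hypervisor. Each step is still performed by hand and then recorded.

```python
from dataclasses import dataclass, field


@dataclass
class VmRestoreTest:
    """Records the manual steps of one isolated whole-VM restore test.

    Hypothetical helper: it only tracks step order, it performs no restore.
    """
    vm_name: str
    done: list = field(default_factory=list)

    # Order taken from the post: networking must be removed before first boot.
    STEPS = (
        "restore_without_boot",
        "remove_virtual_nics",
        "boot",
        "console_logon",
        "check_vm",
    )

    def complete(self, step: str) -> None:
        """Mark a step as done, rejecting any step performed out of order."""
        if len(self.done) == len(self.STEPS):
            raise ValueError("all steps already completed")
        expected = self.STEPS[len(self.done)]
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.done.append(step)

    @property
    def finished(self) -> bool:
        """True once every step has been recorded in order."""
        return len(self.done) == len(self.STEPS)
```

    For example, after `complete("restore_without_boot")`, calling `complete("boot")` raises a `ValueError`, because the VM must not be booted while it still has virtual networking attached.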

    • sdo
      Moderator

      Remember, NetBackup AIR is a super efficient platform for optimized replication of backup data; it is not a DR panacea.  NetBackup AIR does not "do" your DR for you.  Every customer still has to devise and develop their own highly site-specific additional steps to make that replicated backup data useful for both "test DR" and "real DR".  The same is true for any backup product and any backup storage platform.

      • sjaavid
        Level 2

        Thanks for your reply!

        I see now that I should have been more specific. I just want to restore a single database from the "DR" site to a Recovery Database in the production Exchange DAG cluster. It will not be mounted; we just want to document the procedure and verify that it works.

        The reason for this is:

        We back up to the 5330 and use Access for long-term retention. The long-term retention backups are also replicated to the "DR" site using AIR. If something happens to the Access server AND we need to restore from a long-term retention backup, we can then use the catalogued copy in the "DR" NetBackup domain.