AIR and DR testing

Hi all

We use AIR between 2 sites, in order to have the possibility to restore from the other domain, if something bad happens. Now I want to test this, and to work out a written plan. I want to test restore of SQL, Oracle, Exchange and VMware. I have tried to find some documentation that covers this, with no luck. 

Yesterday we tried to test a restore from Exchange. The Exchange DAG cluster in our production environment is so-called IP-less, which means the cluster name is not configured with an IP address. All the nodes are, though.

What do I need to configure on the DR Master, and on the DAG cluster to make this work?

 

Our environment is master servers at 8.1.2 on Linux, several MSDP Linux servers, 5330 appliances and Access servers.


Re: AIR and DR testing

Welcome to the complex world of genuine DR recovery testing.  No vendor manual is ever going to be able to give any customer for any product a full broken down step-by-step restore procedure for "DR testing".  Trust me, the NetBackup manuals really do contain everything you need for normal restore/recovery - but when you want to change this to be a "test DR" recovery, and not even a "real DR" (god forbid) recovery (which would actually be easier than a "test DR" recovery) - then we are talking about three completely and wholly different pathways in just one breath.  *IF* you have only one network then each recovery scenario of... 1) real PROD recovery 2) real DR recovery 3) test DR recovery... all three are significantly different to each other to likely warrant different sets of site-specific procedures, but only because you would be trying to perform all three within the same network space.

I assume that both sites are fully network routable to each other... if so, your prime question is going to be how to spin up a "restored and recovered" DAG without impacting in any way whatsoever the service offered by the live DAG.  And I think you already know the answer to this... either...

1) shut down the live DAG and CAS servers, attempt recovery on alternate hardware for DR DAG and CAS servers, and simply *hope* (fingers crossed) that Active Directory and DNS etc do not mind a DAG moving MAC addresses - and somehow hope that live emails don't start being received in this temporary DAG (which will be deleted after testing)... and hope that Active Directory isn't maintaining some kind of metadata about the DAG, whereby Active Directory might get very upset when a DAG appears to time-travel backwards in time (twice!!)...

...or...

2) perform your test DR recovery in an isolated air-gapped network.

I know which I would choose.  Option 2).

With option 2) you get time to practice, again and again and again.  With option 2) you have no risk at all to live services.  With option 2) you get to test each database platform one at a time, or all at the same time.  And best of all, one of the prime benefits of option 2) is... that... because the test DR network is fully air-gapped it can look exactly like live, so you actually get a win-win-win... because... think about it... your "test DR" procedures become exactly like your "real DR" recovery procedures, which in turn are exactly like your "real PROD" recovery procedures... all because your air-gapped "test DR" network is to all intents and purposes exactly like your production network (just a bit smaller is all).

The same "prod versus test" exposure question goes for all other "database" related services (SQL, Oracle) - i.e. how to "test DR" restore and recover apparently live production database services without them actually being truly live.  The same is even sometimes true for whole VM recovery - but whole VM recovery testing can actually be a little bit easier sometimes, i.e. restore whole VM (but do not boot yet), then manually disengage/disable/strip-off/remove virtual networking from the VM, then boot the VM, then logon at VM console, check VM - and you just have to trust that all will be ok in a true DR situation when the VM is recovered with real live active networking.

Re: AIR and DR testing

Remember, NetBackup AIR is a super efficient platform for optimized replication of backup data, it is not a DR panacea.  NetBackup AIR does not "do" your DR for you.  All customers still always have to devise and develop their own highly site-specific additional steps to make that replicated backup data useful for both "test DR" and "real DR".  The same is always true for any backup product and any backup storage platform.

Re: AIR and DR testing

Thanks for your reply!

I see now that I should have been more specific. I just want to restore a single database from the "DR" site, and restore it to a Recovery Database in the production Exchange DAG cluster. It will not be mounted, we just want to document the procedure and verify that it works.

Reason for this is:

We do backups to 5330, and use Access for long-term retention. If something happens to the Access server, AND we need to do a restore from long-term retention backups, those are replicated using AIR to the "DR" site. Then we have the possibility to use the catalogued copy in the "DR" NetBackup domain.

 

Re: AIR and DR testing

MS Exchange version and patch level?  MS Exchange host OS version?  CAS host server OS version?  NetBackup Client version on all?

Re: AIR and DR testing

NBU 8.1.2 on master, media and clients.

Exchange 2016 CU12, running on win 2016 std version 10.0.14393 on VMware

Re: AIR and DR testing

Your oldest backups are retained within a separate NetBackup domain, and so you will need to manually push-replicate them from the DR site to the PROD site, using the nbreplicate command - search this forum for recent examples of this command.  After that you should (assuming compatibility) be able to restore the very old mailbox database to the recovery database on your PROD DAG.

But, here's a question... about compatibility... what version and patch level was MS Exchange at when you took that old backup?  You are on MS Exchange 2016 CU12 right now, but who is to say that an old backup of a mailbox database from MS Exchange 2016 CU3 (for example) is restorable (i.e. is supported by Microsoft) in to an instance of MS Exchange 2016 CU12?

Here's a thing I like to do, which is to add a keyword to my MS Exchange policies which is the MS Exchange version and NetBackup Client version - e.g. MSEX2016CU12-NBU812 - that way I can query any old backup images and know what versions they were at the time of the backup.
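To make that keyword idea a bit more concrete, here is a rough Python sketch of how such a keyword could be built and later parsed back when reviewing old images. It only handles the string itself and does not call NetBackup at all. The function names and the exact pattern are my own assumptions, modelled on the MSEX2016CU12-NBU812 example above, so adjust them to whatever convention you settle on.

```python
import re

# Hypothetical keyword layout, modelled on the example "MSEX2016CU12-NBU812".
KEYWORD_RE = re.compile(r"^MSEX(?P<year>\d{4})CU(?P<cu>\d+)-NBU(?P<nbu>\d+)$")


def build_keyword(exchange_year, cumulative_update, nbu_version):
    """Build a policy keyword such as 'MSEX2016CU12-NBU812'."""
    nbu_digits = nbu_version.replace(".", "")  # "8.1.2" -> "812"
    return f"MSEX{exchange_year}CU{cumulative_update}-NBU{nbu_digits}"


def parse_keyword(keyword):
    """Return the version fields from a keyword, or None if it does not match."""
    match = KEYWORD_RE.match(keyword)
    if match is None:
        return None
    return {
        "exchange_year": int(match["year"]),
        "cumulative_update": int(match["cu"]),
        "nbu_digits": match["nbu"],
    }


def same_exchange_level(keyword_a, keyword_b):
    """True only if both keywords parse and name the same Exchange year and CU."""
    a, b = parse_keyword(keyword_a), parse_keyword(keyword_b)
    if a is None or b is None:
        return False
    return (a["exchange_year"], a["cumulative_update"]) == (
        b["exchange_year"],
        b["cumulative_update"],
    )


if __name__ == "__main__":
    current = build_keyword(2016, 12, "8.1.2")
    old_image_keyword = "MSEX2016CU3-NBU812"  # made-up keyword from an old image
    print(current, same_exchange_level(current, old_image_keyword))
```

A mismatch here does not mean the restore will fail; it is only a prompt to go and check the Microsoft support position before relying on that old image.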

Are there specific steps that you need help with ?  If so, then you will need to clearly describe your restore procedure and tell us exactly which step / action / part of your restore and recovery procedure that is failing or that you are stuck with, and then wait for our (lucky for us all) resident expert for MS Exchange for NetBackup (Lowell :)) to help out (hopefully).

All I can add is... have you first tested and proven by restoring one mailbox database from say yesterday's PROD DAG whilst that backup image still resides within your PROD NetBackup site ?  IMO, test recovery using same versions first, then test recovery from a very old backup.