
DR Testing Troubles

pinchaser
Level 3
My Environment is:

Master Server 6.0MP5 (not a media server)
2 x Media Servers 6.0MP5, both with attached SAN storage units and an SL500 with LTO2 drives.

In my DR environment I do not have the media servers, but I do have both the SAN and LTO2 drives running on Solaris.  I am trying to recover using FULL catalog backups according to the NetBackup documentation. However, I run into several issues:

1. The recovery wants to communicate with the media servers, which are not in my DR environment. (As a workaround, I have added entries to the hosts file for these media servers that point back to the DR master itself; see the sketch after this list.) This seems to work and allows the recovery to complete. At least after completion I can see all policies, query the backup catalog, verify previous backups, etc.  Is there something that I am doing incorrectly here? I mean, I cannot be the first person in the world that has done this.

2.  Once the above is completed, I cannot seem to delete the (fake) media servers and reconfigure the DR master to see the attached tape and SAN devices.
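
For reference, the hosts-file workaround from point 1 looks roughly like the sketch below. The IP address and server names are placeholders, not my real values:

# /etc/hosts on the DR master
# Map the production media server names to the DR master's own address so
# catalog recovery stops trying to reach servers that do not exist at DR.
10.1.1.50   dr-master      # the DR master itself (placeholder name/IP)
10.1.1.50   prod-media01   # production media server 1 (placeholder name)
10.1.1.50   prod-media02   # production media server 2 (placeholder name)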

If anyone has done something similar please let me know what I am doing wrong.

Thanks
12 REPLIES

Omar_Villa
Level 6
Employee
Check this note: you need the same topology on your DR site.
 
 
 
regards

pinchaser
Level 3
Ok... let me phrase this differently....
 
In a true disaster where my whole environment is destroyed, I will not have my original media servers. I will start with just a master server (which will also have to act as the media server). I need to be able to recover all policies and tape images so I can start recovering clients.
 
The link provided only covers the case where the master server is being replaced in the exact environment where it was damaged. That is not the disaster I am trying to recover from.
 
Thanks

dukbtr
Level 4
We are actually going to be doing the same type of DR test next week.  Our master at the DR site will have a different name and will also be a media server.  We were also going to do the IP masking to point back to the new box.  I'll try to remember to post our findings on this matter.

Mark_Glazerman
Level 3
We had exactly the same issue at two DR tests last year.   In the first test we spent countless hours waiting for the spinning V of death (as we labelled it) to stop, but as you stated, it was trying to find our SAN-attached media servers and even our LTO tape drives, which were not present at our DR site.  Because we couldn't get it to ignore these entries, we could never get to the point where we could configure the tape drives at the DR site to start recovering the data on tape.

Between that test and the next one we talked in depth with both our SE and Symantec tech support, who assured us that there was a CLI command we could run which would initiate a partial catalog recovery, leaving out our tape devices and SAN-attached media servers.  Surprise, surprise, this didn't work either, and after about 16 hours on the phone with support they confirmed that it would never work and was probably corrupting our EMM database by not restoring all of its components.

Our solution was to make use of spare cabinet space in a rack we have on a managed services floor at our DR site to house an identical master and media server, configured exactly like our production boxes. During a DR test we now just power the master server up and do a catalog restore from our disk-based storage (Data Domain DD560 with replicated data from our production environment).  Once the master has the latest catalog restored, we can start feeding data to the boxes that we build from scratch at the DR site.  It works like a charm, and at our last test we had 100% of our mission-critical applications restored and operational within 16 hours.

Sorry for the wordy post.

Mark


pinchaser
Level 3
Anyone have input?

mr_crosby
Not applicable
I have a very similar problem. I managed today to recover the catalog, only to have the master on the live site stop working at midday. Now I have the problem of the master server on the DR site not being able to see the SL500 tape library......

loco_nbu
Level 3

I have a similar issue..

Data Center

1 Windows 2003 Master Server 6.5.1a

4 Windows 2003 Media Servers 6.5.1.a

2 SL500's with Expansion Modules

1 VTL

DR Site

1 Windows 2003 Master Server 6.5.1a

1 Windows 2003 Media Server 6.5.1a

1 SL500 no Expansion Module

1 VTL

...the problem is these are not the same topologies, and I won't be getting any more hardware for the DR site.

Now what? I can't restore the catalog at the DR site according to the documentation. Do I cluster the Master servers between the Data Center and the DR site?

Is there anyone who has successfully set up a DR site scenario without the same topology?

Thanks in advance for input...

seemayur
Level 3

OK, so here's how we did it:

 

Production Environment - 1 Windows 2003 Master 6.0mp5; 1 Windows 2003 Media; 1 VTL; 1 LTO3 library

DR environment - 1 Windows 2003 Master; 1 LTO3 Library

 

In production, we have the master server configured to back up the catalog directly to LTO.  This way, when we get to DR, only the master server (with an identical name and IP as prod) and an LTO3 drive are required to get the catalog restored.
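
If you want to double-check where your catalog backup policy writes, something like this on the master shows the policy's residence (a sketch only; NBU-Catalog is a placeholder policy name, and on a Windows master the command lives under install_path\NetBackup\bin\admincmd):

# List the catalog backup policy attributes and check the Residence line
# points at the LTO storage unit rather than the VTL.
bppllist NBU-Catalog -L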

 

For those of you that are using the DR recovery file that is saved as part of the catalog backup: that text file specifies which server was used to do the backup of the catalog.  If your recovery server is different at DR, you can modify that file so that it uses the alternate server.  We do this in a lab environment.
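
As a rough sketch of that edit (the file name and server names below are placeholders, and on a Windows master you can just make the same change in a text editor), on a Unix master it amounts to:

# Keep a copy of the original DR file, then swap the recorded server name
# for the server that will actually perform the recovery at DR.
cp catalog_backup_1234567890_FULL catalog_backup_1234567890_FULL.orig
sed 's/prod-master/dr-master/g' catalog_backup_1234567890_FULL.orig > catalog_backup_1234567890_FULL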

 

 

loco_nbu
Level 3

Thanks for the response seemayur.

 

I have a dumb question though... how do you have identical IPs and server names at your DR site and your Data Center site without causing DNS issues? Is your DR site not live?

seemayur
Level 3

Sorry, I should have mentioned that.  We have a cold DR site; in fact, we use SunGard for our recovery.

 

The only thing that we have online is a couple of AD domain controllers and a documentation repository.

 

loco_nbu
Level 3

Ok, that makes sense then.

When importing the catalog at the DR site, you haven't had any trouble with the different topologies as described in the documentation?

 

That's good to know and I'll be giving that a try in the near future. We also use SunGard but currently we have a hot site for everything. That might have to change for backups if I can get this to work.

mr_crosby2
Not applicable
I finally got it to work using the same hostname with different IPs and a different topology. That was following the NetBackup process using the recovery wizard and the catalog_online file, and restoring only the configuration and image files.
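
Roughly, the sequence was (a sketch only; these are the Unix paths, the hosts entries are the same trick mentioned earlier in the thread, and the exact wizard prompts vary between 6.0 and 6.5):

# 1. Add hosts entries mapping the production server names to the DR master.
# 2. Configure the library/drives at DR and confirm they are visible:
/usr/openv/volmgr/bin/tpconfig -d
# 3. Run the catalog recovery wizard against the DR file produced by the
#    catalog_online backup, and choose to restore the configuration and
#    image files:
/usr/openv/netbackup/bin/admincmd/bprecover -wizard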