Large-ish restore (spanning media) fails to do ...anything!

jim_dalton
Level 6
Hello Forumfolks,
I'm interested in knowing whether anyone has encountered the same problem I've got. I need a fix or a workaround, as it leaves me unable to restore a critical system in DR, which makes DR rather pointless.
 
To summarise: I've backed up my application (SAP) while SAP was down. This backup ran to disk staging, with the SAP host acting as its own media server (that's to say, not the master server).
The data on disk were later written to final storage, LTO2, again with the SAP host as its own media server. This one backup image spanned TWO LTO2 tapes. I run a shared storage setup with the same drives shared across three media servers, one being the master.
 
Now I move to DR. I recover my master server and I can see my backup image on DISK (disk I now don't have, as I'm at DR), so I promote my tape images to be primary. That's fine.
 
And now I pick off a sub directory of my backup and I submit the restore. NetBackup requests and receives the first tape, positions to file 1, then spends 45 minutes claiming it's positioning. The job monitor tells me it positioned in 00:00:00 (as expected, since it's the only data on tape), but the restore job doesn't say this: it just says 'waiting for positioning of media id xxxxxx', and it still says this after 45 minutes.
 
I know restores work, since I've restored other data from the same DR configuration successfully, but the successful restore didn't involve an image spanning multiple volumes.
 
Additionally, at a previous DR test the same set of actions was carried out, the only difference being that we produced just one tape. The data have since grown such that the backup runs to two tapes, and now it seems I'm not able to restore.
 
Thanks in advance, Tim
6 REPLIES

jim_dalton
Level 6
I should add: NetBackup 5.1 MP5 on Solaris 9 (master server) and Solaris 10 on the client end.

Andy_Welburn
Level 6
Does the job actually 'fail' (error codes?) or just continue ad infinitum?


@Tim walton wrote:

And now I pick off a sub directory of my backup and I submit the restore.
 


I know in some instances, especially when selecting individual files or folders from a large backup, we have had occasions where it takes some time to actually 'search' for the images requested. To test whether this is the case, you could try restoring everything so as to get an 'immediate response'; you can always cancel the restore once it has done a few bytes.




jim_dalton
Level 6
Andy... I'll give your suggestion a try. As you guess, it doesn't actually fail; it just hasn't done anything, and after 45 mins that to me looks like it's simply not functioning correctly. A normal restore gives much more feedback, as you are probably aware.
 
I'll get back to you on this: thanks for your input.
 
Tim

Andy_Welburn
Level 6
Found this, don't know how relevant it may be (it does relate to NetBackup Enterprise Server 5.0 MP5):

Document ID: 279686

GENERAL ERROR: Restores hang for an hour or more while bpdbm polls every client name in the images directory
http://support.veritas.com/docs/279686
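
If the hang does come from client names that are slow to resolve (an assumption; the note's title only says that bpdbm polls every client name), one rough way to check is to time a lookup for each name yourself. Below is a minimal sketch in Python, not a NetBackup tool: `time_lookups` is a hypothetical helper, and you would feed it your own list of client names.

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.gethostbyname):
    """Time one name lookup per host; return (name, seconds, ok), slowest first.

    `resolve` defaults to the system resolver and can be swapped for testing.
    """
    results = []
    for name in hostnames:
        start = time.monotonic()
        try:
            resolve(name)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((name, time.monotonic() - start, ok))
    # Slowest lookups first, since those are the ones that would stall a restore
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Calling `time_lookups(["clienta", "clientb"])` (made-up names) returns one tuple per host; any name that takes seconds rather than milliseconds, or fails outright, is a candidate for fixing in DNS or in the hosts file.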




jim_dalton
Level 6
Andy: you were correct.
If I select the whole of my backup then it springs into life immediately, so thanks for the tip.
This solves my immediate problem but leaves me with an issue, since I can't be having to restore more than I want, or to wait an hour (or more?) to restore 300 MB of data, just to get NetBackup to perform in a timely manner. My call with Symantec remains open, but the cause is isolated.
 
Many thanks, Tim

jim_dalton
Level 6
To close, I followed your other link, Andy: I'm already familiar with this, having done a proper DR test and seen NetBackup grind to a halt. There are a few (other) solutions to that problem: get your DNS up and running ASAP, or remove DNS altogether and populate /etc/hosts... it depends on how many hosts you're talking about. Fortunately, whilst we have quite a few, by the time my NetBackup server is up, so is my DNS, so host resolution is working. Adding aliases is also a neat trick that can be exploited.
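
As a rough illustration of the /etc/hosts route, here is a small Python sketch that renders hosts-file lines (address, canonical name, then any extra names) from a mapping. `render_hosts` is a hypothetical helper, not part of NetBackup, and the addresses and host names in the usage note are invented. The extra names on a line are ordinary hosts-file aliases; whether that matches the alias trick mentioned above is a guess.

```python
def render_hosts(entries):
    """Render /etc/hosts text from {ip: [canonical_name, alias, ...]}.

    Each output line is: address, a tab, then the names separated by spaces.
    """
    lines = []
    for ip, names in entries.items():
        if not names:
            raise ValueError(f"no host name given for {ip}")
        lines.append(f"{ip}\t{' '.join(names)}\n")
    return "".join(lines)
```

For example, `render_hosts({"192.0.2.10": ["master01.example.com", "master01"]})` gives a single line with the address, a tab, and the two names. You would review the output before appending it to the real file on the recovered master.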
 
Thanks, Tim