Peculiar dag / db restore situation

jim_dalton
Level 6

We've done restores many a time. But this time it's different...the restore job ends, the data have been read, the tape dismounts. But the job doesn't complete: it reports return status 0 in the job details yet stays in "Active" status in the Activity Monitor.

We use the same restore procedure as we have since the year dot...follow the steps.

One difference is that the restore is to a passive node in the DAG, and its local passive partner has severe disk issues...the controller has died. My Exchange admin assures me this is not relevant since the restore isn't to a DAG database, by which I assume he means it's a database that exists only on the restore target node.

NB7604 Solaris master->windows server dag.

Any ideas?

1 ACCEPTED SOLUTION

Accepted Solutions

@jim_dalton If the node you restore to has disk issues, it could be the log apply on the database that is taking a long time. As far as I remember, you can see the log apply/restore in the event log on the target Exchange server.

The standard questions: Have you checked: 1) What has changed. 2) The manual 3) If there are any tech notes or VOX posts regarding the issue

7 REPLIES 7

Lowell_Palecek
Level 6
Employee

To clarify, a DAG doesn't have active and passive nodes. Databases have active and passive copies. It seems the database in question has only an active copy, on the target server.

Actually, since it's a DAG, bprd chooses the server on which to restore the database. It chooses the server with the active copy of the database. You will observe that bpresolver executes for this purpose, and in the bprd log you'll find a process within a process.

For the job to end, tar32 sends status to bpbrm, which forwards it to the child bprd process. The child bprd process then sends status to the parent process.

Logs to consider then are the tar log on the client, bpbrm on the media server, and bprd on the master server.
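As a rough illustration of the status relay described above (tar32 to bpbrm, bpbrm to the child bprd, child bprd to the parent bprd), the toy model below reports the first hop that has not yet handled the status and which log to read for it. The process names and log locations come from the post; the function name `first_stalled_hop` and the data layout are hypothetical, and nothing here talks to NetBackup.

```python
# Toy model of the restore status relay. Only the process names and log
# locations come from the thread; everything else is a hypothetical sketch.
RELAY = [
    # (process that must handle status, log to inspect, host)
    ("tar32", "tar log", "client"),
    ("bpbrm", "bpbrm log", "media server"),
    ("child bprd", "bprd log", "master server"),
    ("parent bprd", "bprd log", "master server"),
]


def first_stalled_hop(handled):
    """Return the first hop not in `handled`, or None if every hop is done.

    `handled` is the set of process names you have already seen report
    or forward the exit status in their logs.
    """
    for process, log, host in RELAY:
        if process not in handled:
            return {"process": process, "log": log, "host": host}
    return None
```

For example, if the tar log shows tar32 sending its status but nothing further is visible, `first_stalled_hop({"tar32"})` points at the bpbrm log on the media server as the next place to look.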

Are all of the systems involved on the same NetBackup version?

Active/passive nodes versus dbs: point taken. But for the most part the copies on that node are passive, so I call 'em passive nodes.

The restore is not to the original db but to a restore db, which we create and select as part of the restore process.

The target client is a little down-rev compared to the others (7504) but it has been functional for as long as I can remember. Rerun and logging seems like our only option.

 

Thanks, Jim

One more question. Are you launching the restore from a GUI, or via bprestore? There was an issue in 7.6, fixed in 7.6.1 with bprestore -w on a DAG database. The issue was that the -w option wasn't honored and bprestore returned before the restoration was done. The solution would be to wait until the job is truly done, or to upgrade the master server.
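If the restore is scripted and the caller cannot rely on bprestore -w to block, "wait until the job is truly done" amounts to a polling loop. The sketch below is a minimal, generic version: `get_state` is a hypothetical caller-supplied function that returns the job's current state, and the state strings "Active" and "Done" are assumed labels. How the state is actually looked up is deliberately left out.

```python
import time


def wait_until_done(get_state, poll_seconds=30.0, timeout_seconds=3600.0,
                    sleep=time.sleep):
    """Poll get_state() until it returns "Done" or the timeout elapses.

    get_state is a hypothetical callable supplied by the caller; this
    sketch does not know how to query NetBackup itself.
    Returns True if the job reached "Done", False on timeout.
    """
    waited = 0.0
    while True:
        if get_state() == "Done":
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays. Note that in the situation described in this thread the job was legitimately still working, so a generous timeout is the safer choice.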

sdo
Moderator
Partner    VIP    Certified

Jim - hate to give you the bad news, but... in this particular instance, running mixed revs of NetBackup Client is... AFAIK... officially unsupported.  I get the impression that Veritas try to support as many mixed application-to-application versions as they can, but as we know, Veritas' own hands are tied quite often by other vendors.  Sometimes, and we never really know which... it is either because of 1) other vendors' own prohibitions... or 2) because they simply don't have the resources to test an almost unlimited number of mixed scenarios.

Plus we haven't checked whether the MS Exchange version of the backup is supported by the down rev NetBackup Client version, have we?

IMO, I think your best bet is to update the down rev target restore client to the same NetBackup Client version that was originally used in the backup... and try again?  Fingers crossed for you.

I don't think it's that bleak. I often do Exchange backups and restores with the client NetBackup version behind my master/media server. Also, I often upgrade my client between backup and restore. (Now, if your target restore client is below the version that made the backup, then we wouldn't support it. It could work, but that's not something we would test. I have my doubts whether that's your problem here. I expect the issue is on the master server.)

I'd like to know how the restore is being invoked - Windows GUI, Java console, command line. How long does the job persist in the Activity Monitor? Does tar32 on the client complete, and the job then stays in the Activity Monitor?

Can you get a level 5 log of bprd? Look for BPRESTORE in the log. Capture everything from the beginning of the parent pid to the end of it. I'm interested in how the parent pid communicates with the child pid.

(Per request previously posted in this forum, don't post a file named xxxx.log.)
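The bprd log request above (find BPRESTORE, then capture everything from the parent pid's first line to its last) can be sketched as a small filter. This assumes each log line starts with a timestamp followed by the pid in square brackets, roughly `HH:MM:SS.mmm [pid] ...`, with an optional `.tid` suffix inside the brackets; that layout is an assumption to check against your own log, and the function names are hypothetical.

```python
import re

# Assumed line prefix: "HH:MM:SS.mmm [pid]" or "HH:MM:SS.mmm [pid.tid]".
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} \[(\d+)(?:\.\d+)?\]")


def pid_of(line):
    """Return the pid string from a log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.group(1) if match else None


def pids_mentioning(lines, needle="BPRESTORE"):
    """Return pids, in first-seen order, that logged a line containing needle."""
    seen = []
    for line in lines:
        pid = pid_of(line)
        if pid is not None and needle in line and pid not in seen:
            seen.append(pid)
    return seen


def span_for_pid(lines, pid):
    """Return all lines from the pid's first entry to its last, inclusive.

    Lines from other pids inside that window are kept on purpose, so the
    parent/child exchange stays visible.
    """
    hits = [i for i, line in enumerate(lines) if pid_of(line) == str(pid)]
    return lines[hits[0]:hits[-1] + 1] if hits else []
```

With the log read into a list of lines, `pids_mentioning(lines)` gives candidate pids to start from, and `span_for_pid(lines, pid)` yields the excerpt to share.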

Kudos to Michael. All the other input is good to know as well, but the log showed us what was happening: the restore was just doing what it needed to do. Restore successful, db mounted and running.