02-17-2014 08:43 AM
Hi,
I am running RHEL master servers in two locations: one in London (main DC) and one in Croydon (DR site). Both are running NBU 7.5.0.7. The Croydon server has a dual-boot disk so I can bring it up with the same details as the main DC master, except for the IP.
Backups are taken to an HP B6200 StoreOnce D2D system using OST, and SLPs create duplicates of the data between both sites. This includes the catalog.
Having stopped NBU, shut down the DC master, started the DR server as the DC master and configured the storage servers, disk pools and storage units, I get some issues when importing the catalog. When I run nbcatsync -sync_dr_file <disaster recovery file>, it returns an error 13 message and I am not sure why. When I run bprecover -wizard -copy 2, this appears to work: it creates all the policies, clients, devices etc. But when I try a restore, nothing is shown. If I then look in the catalog GUI for entries for the StoreOnce backups, it only retrieves the catalog backup, even if I select copy 2. I have tried making copy 2 the primary and importing, but I get the same response. The strange thing is that all the images can be found in the db folder; it is as if the database is not being updated correctly.
I have even tried the following:
bpimport -create_db_info -stype hp-StoreOnceCatalyst -dp BFB-HPUX-REPSTORE -dv BFB-HPUX-REPSTORE
cat_export -client bfbackup.ipcmedia.com
Go to the following directory to find the DR image file
Hot_Catalog_Backup_1392595286_FULL:
/usr/openv/netbackup/db.export/images/bfbackup.ipcmedia.com/1392000000
Open Hot_Catalog_Backup_1392595286_FULL file and find the BACKUP_ID
(for example: bfbackup.ipcmedia.com_1392595286).
bpimport [-server name] -backupid bfbackup.ipcmedia.com_1392595286
bprestore -T -w [-L progress_log] -C bfbackup.ipcmedia.com -t 35 -p Hot_Catalog_Backup -X -s 1392595286 -e 1392595286 /
Not sure what they mean about this bit --- Run the BAR user interface to restore the remaining image database
Again, this seems to create the images in the directories and all the policy info, but I cannot see any of the data from the Catalog or Restore GUI.
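For anyone following the steps above, this is a rough sketch of how the directory name and BACKUP_ID relate to the catalog backup timestamp. The "round down to the nearest million" rule is only an assumption drawn from the single example in this post (1392595286 sits under 1392000000), and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: derive the names used in the steps above from a client name
# and a catalog backup timestamp. The rounding rule is an ASSUMPTION based
# on one example in this thread, not documented NetBackup behaviour.

# image_bucket TIMESTAMP -> directory name under images/<client>/
image_bucket() {
    echo $(( $1 / 1000000 * 1000000 ))
}

# backup_id CLIENT TIMESTAMP -> <client>_<timestamp>
backup_id() {
    printf '%s_%s\n' "$1" "$2"
}

client=bfbackup.ipcmedia.com
ts=1392595286

echo "directory: /usr/openv/netbackup/db.export/images/$client/$(image_bucket "$ts")"
echo "backup id: $(backup_id "$client" "$ts")"
```

The backup ID printed here is the value that would be passed to bpimport -backupid in the step above.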
Any help would be appreciated.
Obviously this would be easier when/if HP get AIR working.
Kev
02-18-2014 02:11 AM
These issues can be a bit difficult to understand - I will freely admit that the troubleshooting guide is not exactly easy to follow at times, for example 'using the BAR GUI to recover the remaining images' ... it does not make it immediately obvious when this applies.
nbcatsync status 13 - not sure what's going on there ... let's see if we can progress without it, which may or may not work.
Couple of ways this could be done, first, the hard way ....
The HP device will have a media id like this: @aaaab or @aaaac - this can be seen with the nbdevquery command, or simply by looking in the DR file.
Let's say, for example, you have this on the primary master:
@aaaab - <some storage>
@aaaac - copy1 of catalog
@aaaad - copy2 catalog
You need to create the DR master with the same disk IDs. In this case @aaaab doesn't matter; we only need the storage holding copy 2 of the catalog.
If the DR side only sees the copy 2 HP device, this will come back as @aaaab, as the media id will be b for the first device configured, c for the next, d for the next and so on ...
If you have multiple devices on the DR side, simply configure them in the order required to match the production side.
If you do not have enough devices, you can create a couple of advanced disk STUs; these can be used as 'dummy' devices to nudge the media ids into the order you require.
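To make the ordering concrete, here is a small sketch of the numbering as described above (b for the first device configured, c for the next, and so on). It does not talk to NetBackup at all, the function names are hypothetical, and it only covers the single-letter pattern, so treat it as a planning aid rather than a statement of how NetBackup allocates IDs.

```shell
#!/bin/sh
# Sketch only: models the ordering described above, where the first disk
# device configured gets @aaaab, the next @aaaac, and so on.
# ASSUMPTION: IDs follow this simple single-letter sequence.

# media_id_for_position N -> media id expected for the Nth device configured
media_id_for_position() {
    letters=bcdefghijklmnopqrstuvwxyz
    if [ "$1" -lt 1 ] || [ "$1" -gt ${#letters} ]; then
        echo "position out of range: $1" >&2
        return 1
    fi
    printf '@aaaa%s\n' "$(printf '%s' "$letters" | cut -c "$1")"
}

# dummies_needed N -> dummy devices to configure first so that the
# device you care about is the Nth one configured
dummies_needed() {
    echo $(( $1 - 1 ))
}

media_id_for_position 3   # prints @aaaad
dummies_needed 3          # prints 2
```

In the example layout above, copy 2 of the catalog is on @aaaad, so on a DR master that only sees that one device, two dummy devices would have to be configured before it.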
Second, the easy way ...
On the DR side, make a safe copy of the DR file, and then edit the real one to change the media ids to match the DR side. In the DR file, on the DR_MEDIA_REQ_LINES, I think the HP device hostname will be shown; this may need editing to match the DR side. It's kinda difficult to describe without seeing it, but you get the idea. Of course, if the DR file matches the disk layout on the DR server in the first place, there is no need to run nbcatsync at all.
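The copy-then-edit step could look something like the sketch below. The function name and the .orig suffix are made up for illustration, it assumes media ids contain only @ and alphanumeric characters (so they are safe to drop into a sed pattern), and it replaces every occurrence in the file, so check the result by hand before using it for a recovery.

```shell
#!/bin/sh
# Sketch only: keep a safe copy of the DR file, then rewrite one media id.
# ASSUMPTION: ids contain only '@' and alphanumerics.

# remap_media_id FILE OLD_ID NEW_ID
remap_media_id() {
    file=$1 old=$2 new=$3
    backup="$file.orig"
    # Never clobber an earlier safe copy.
    if [ -e "$backup" ]; then
        echo "refusing to overwrite existing backup: $backup" >&2
        return 1
    fi
    cp -p "$file" "$backup" || return 1
    # Rewrite from the untouched copy into the real file.
    sed "s/$old/$new/g" "$backup" > "$file"
}

# Example (hypothetical path):
# remap_media_id /path/to/dr_file @aaaad @aaaab
```

Any hostname or address on the same lines would still need editing by hand, as described above.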
Regarding the recovery that appears to have brought back only the image data for the catalog backups:
Are you using full and incremental catalog backups? If you are recovering from an incremental catalog backup, then at certain versions of NBU (which escape me at the moment) there is a code issue that causes this. A simple workaround is to just run bprecover -wizard a second time (using the same backupid as the first time) - this should bring back the entire catalog. As mentioned, this depends on the version of NBU and on whether incremental catalog backups have been used.
02-19-2014 02:20 AM
Cheers Martin,
As you say the documentation regarding this is very flakey to follow.
I am taking full catalog backups from the live system.
Will give this a test in the next couple of weeks as I need to get a Change Control authorised to take down the backup servers.
Kev
02-19-2014 05:14 AM
Hi Martin,
This is a sample of the DR file from the Live site:
02-19-2014 09:43 AM
Yes, that's correct.
In essence, you have to change anything unique to the storage device to match the storage device on the DR side.
1392663600 @aaaa4 1;hp-StoreOnceCatalyst;10.132
What appears odd here is the media id @aaaa4 - I've only ever seen these contain letters, not numbers. I presume that this matches up with the media ids on the prod side seen in nbdevconfig output.
If 'hp-StoreOnceCatalyst' is the hostname, this will need to match also.
Another question: are you running a full or partial catalog recovery (as selected within the GUI or bprecover -wizard)? I think that if the catalog is replicated, you are only meant to recover using a partial recovery, then restore the remaining files in /usr/openv/netbackup/db/images via the BAR GUI (hence the bit you mentioned in the manual).
TBH, it might be easier to log a call for this - having just recently looked at a call along these lines, it was much much easier to look at the system directly via webex.
If you are using full backups, ignore my comment about running bprecover twice; this was just a workaround for a code issue that afaik only affected incremental catalog backups.
Still, if you are happy to have a go at the DR file, try this first. But nbcatsync should be working, and that is an issue that should really be investigated, as it would make life a lot simpler.
M
02-20-2014 06:24 AM
Hi Martin,
All the catalog backups we take are full ones, and 'hp-StoreOnceCatalyst' is the server type. If I run nbdevconfig -previewdv -storage_server -stype from the main master server against the storage server on the other site that it uses for the duplicate, I get the following:
02-20-2014 10:40 AM
Hi Kevin,
nbdevquery gives the media id
nbdevquery -stype hp-StorO -listdv
nbdevquery -stype hp-StorO -listdp
(You might need to put in the storage server name, and perhaps a -L or -U to get the right output format. Apologies, I don't have a machine to hand to try it on.)
If you've run the full catalog recovery, it should have at the very least restored enough to allow manual restore of the rest (which under certain conditions is as per the manual).
The files you are after restoring are in /usr/openv/netbackup/db/images, so you use the BAR GUI as normal to search these backups, with the exception that you have to set the policy type you are browsing to nbu-catalog (else nothing will be found).
If you look under ...db/images/<master name> you should see some dirs, e.g. 1391000000, 1392000000 etc ... these should contain the catalog info for the catalog backups, and are what you want to browse, to allow the restore of the other catalog files (of non-catalog backups). You would restore everything in db/images back to where it came from.
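As a quick way to check which of those directories actually came back, something like this sketch would list them. The function name is made up, and the idea that the directories of interest are exactly the all-digit names is an assumption based on the examples above.

```shell
#!/bin/sh
# Sketch only: list the all-digit subdirectories (e.g. 1391000000) under
# an images/<master name> directory.
# ASSUMPTION: the directories of interest have purely numeric names.

# list_image_buckets DIR
list_image_buckets() {
    for d in "$1"/*/; do
        name=$(basename "$d")
        case $name in
            ''|*[!0-9]*) ;;        # skip anything that is not all digits
            *) echo "$name" ;;
        esac
    done
}

# Example, with <master name> filled in for your own server:
# list_image_buckets "/usr/openv/netbackup/db/images/<master name>"
```

If this prints nothing for the master, the recovery has probably not brought back the image data you need to browse.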
Regarding nbcatsync, I'd have to check which logs it writes to for certain, but I'd start with admin and bpdbm.
Martin
02-21-2014 03:24 AM
Many thanks again Martin,
I have got the media ids now from nbdevquery. Once I create the disk pools etc on the DR site, I will check what they are listed as and go from there. I will let you know what happens when I run the test a week on Monday.
Kev
02-21-2014 03:41 AM
Thanks Kevin,
It is a bit messy to do it this way, but if we can eliminate the need for nbcatsync, we can see if there is anything else causing an issue. The DR / Troubleshooting guide explains catalog recovery in different circumstances and it is not the easiest guide to read - in terms of 'deciding' if the explanation in the guide matches your setup.
As mentioned, I recently went through a similar(ish) issue, and at first we thought nbcatsync was contributing - but by using this method to eliminate it, we found the issue still existed. That allowed us to move in quicker on the real issue without spending what would probably have been quite a bit of time trying to pick apart exactly what nbcatsync was doing and whether what it was doing was correct.
Many thanks,
martin
03-03-2014 05:51 AM
I ran the full test this morning but it still failed, so I reinstalled NetBackup on the DR server and ran nbcatsync for copy 2. This time it found the correct disk ids, and I was able to import the catalog and perform a restore from one client to another.
Many thanks for your help with this one, just got to document what I did now.
Kev
03-03-2014 11:42 AM
Hi Kevin,
Thanks for the update.
How strange - looks like something was amiss in the deepest dark workings ...
M