Exchange GRT restore issue - NetBackup 7.5.0.6

Systems_Team
Moderator
   VIP   

Hi Folks,

Hoping someone can help shed some light on this one:

Master Server: NetBackup 7.5.0.6 (Windows 2008 R2 SP1 64bit)
Media Server: NetBackup 5230 Appliance Version 2.5.3

Exchange: Exchange 2007 SP3 (Windows 2008 R2 SP1 64bit)


I have a VMware policy configured using application protection for Exchange.  This completes successfully, but when trying to do a GRT restore I can expand the Information Store and Storage group with no problems.  Attempting to expand the Database however eventually times out and gives the dreaded "ERROR: database system error".

I have been over & over the requirements for Exchange GRT and am sure I have everything correct.  To prove it, I created a standard MS-Exchange policy with the "Perform snapshot backups" and "Enable granular recovery" options turned on.  This backs up fine (although via LAN - I want to be going via SAN and VMware), and I am able to expand all items and go down to individual messages.  This seems to indicate I have everything set correctly for GRT restores.

Does anyone have any ideas as to why this might be?  I've been banging my head on this one for a little while now.  I'd much rather use the VMware policy type with Exchange protection so I get the SAN performance, but also want granular recovery.

Many thanks in advance,

Steve

 

1 ACCEPTED SOLUTION


Systems_Team
Moderator
   VIP   

Hi Folks,

Just thought I'd put in an update about what my particular issue was in this case - very interesting.  Was working it through with support when we found it.  Moderators - you can lock this thread afterwards if you like :)   Thx, Steve


We found the answer to this one by accident.  One of my network engineers came to me about a week ago and said he had some suspicious traffic between two devices (the appliance and Exchange server for this case).  He asked if I would check it out for him.

The data for this NetBackup environment traverses a Fortigate firewall/security appliance with IPS/IDS.  I had already checked all possibilities while setting up and trying to troubleshoot the GRT browse issue - so I was aware this firewall was in place, and we had already confirmed (multiple times) that all required ports were open.  What I wasn't aware of was that the IDS/IPS was scanning all traffic for vulnerabilities and, where one was detected, engaging an active response (i.e. dropping the traffic).  The Fortigate was not set up to send automated alerts when something was detected, so it was only when our network engineers logged in to the device for a check that this was discovered.

The vulnerability as detected by the Fortigate device was: http://www.fortiguard.com/encyclopedia/vulnerability/#id=34667

The Microsoft write-up of this issue from the Windows side is:  http://technet.microsoft.com/en-us/security/bulletin/ms13-014

Once we were aware of this, a rule was put in place to ignore traffic between the appliance and Exchange server - GRT browsing then started working.  I had some additional issues in getting the restore to actually complete, but that was a minor issue in which the client was attempting to contact the appliance using a short DNS name.  A minor tweak and having it use the FQDN resolved that issue and it now works end to end :)   We had a similar issue with our production site (although this was throwing a File I/O error rather than Database error) - the same ignore rule was applied at the production site and GRT browsing worked there as well.
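The short-name problem described above is easy to check for on the client before a restore. A minimal sketch in Python, assuming you pass in whatever server name the client is configured to use; the host names in the usage note are hypothetical, not the ones from this environment.

```python
import socket


def is_short_name(host: str) -> bool:
    """Return True when the name has no domain part (e.g. 'appliance01')."""
    host = host.strip().rstrip(".")
    return "." not in host


def resolved_fqdn(host: str) -> str:
    """Ask the local resolver for the fully qualified name of host.

    socket.getfqdn() returns the input unchanged when it cannot resolve it,
    so an unchanged short name is a hint that DNS/hosts needs attention.
    """
    return socket.getfqdn(host)
```

For example, `is_short_name("media-appliance")` is true while `is_short_name("media-appliance.example.com")` is false; comparing the configured name with `resolved_fqdn(...)` shows what the client will actually try to contact.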

I reviewed the Microsoft KB article, and although I could not see the specific patch installed in either Exchange server, judging by the release date I would suspect that this fix has already been applied in a later security rollup update.  I reviewed CVE vulnerabilities for NetBackup software and appliances and could find nothing relating to this.  Also of interest is that during the initial part of this issue I did have Symantec Endpoint Security installed on the Exchange server, and this had reported nothing.

The previous engineer on this case before yourself pointed out in the NCFLBC logs that we were seeing issues with renaming files.  I believe this is where the DoS attack that the Fortigate detected was coming from: if you read the detail of the Microsoft KB, it talks about attempting to rename a file on a read-only file system causing the issue described.

A couple of other points of interest: During an Active Directory GRT browse, and also during a traditional Exchange Snapshot (non-VMware) GRT browse, this issue was not evident.  It was only via a VMware Exchange GRT browse that an active response was triggered on the Fortigate appliance.  I'm not sure if you want to pass this info on to any of your other teams, as I guess it is quite possible any IDS/IPS system scanning and responding to this vulnerability could create the same scenario.
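The "ports were open but traffic was still dropped" situation can be illustrated with a simple TCP probe. This is only a sketch in Python: a completed handshake shows that the firewall lets the connection through, not that an IPS will leave the later payload alone, which is exactly the trap described in this post. The host name in the usage note is hypothetical.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Something like `tcp_port_open("media-appliance.example.com", 7394)` would test the nbfs port that appears in the log excerpt later in this thread. If the probe succeeds and the GRT browse still fails, the IDS/IPS logs are the next place to look.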


24 REPLIES

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

Make sure you open the BAR gui with "run as administrator"

 

See if that works, before we continue.

RamNagalla
Moderator
Partner    VIP    Certified

By any chance, did you look at the tech note below?

http://www.symantec.com/business/support/index?page=content&id=TECH182930

Issue

When Browsing/Restoring data from an Exchange or Sharepoint VMware Application backup, database system error can occur when you have two VMDKs with the same name.

Cause

This issue occurs because there are multiple VMDKs on the virtual machine with the same name but different paths.

 

Systems_Team
Moderator
   VIP   

Hi Riaan / Nagalla,

Thanks for your replies.

@Riaan - yes, I have tried Run as Administrator; unfortunately it's still the same issue.

@Nagalla - yes, I have checked this TN.  There are three VMDKs for this Exchange server, all in the same datastore, but with different file names:

lwsubmail01_vmdk

lwsubmail01_1.vmdk

lwsubmail01_2.vmdk

So I don't think this is the issue.
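For anyone else checking the duplicate-VMDK condition from that tech note, the comparison is just "same file name, different path". A minimal sketch in Python, assuming you have already collected the disk paths for the VM as strings in the usual `[datastore] folder/file.vmdk` form; the function name and the sample paths are made up for illustration.

```python
from collections import defaultdict


def duplicate_vmdk_names(paths):
    """Group datastore paths by VMDK file name; return names used more than once."""
    by_name = defaultdict(list)
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        # Handle "[datastore] file.vmdk" with no folder component.
        if "]" in name:
            name = name.split("]", 1)[1]
        by_name[name.strip().lower()].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}
```

An empty result means no two disks share a file name, which matches the situation described in this post.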

Many Thanks,

Steve

 

RamNagalla
Moderator
Partner    VIP    Certified

Maybe it's time to look into the logs. Please enable ncflbc logging, then attach the ncflbc log output produced by vxlogview.

Systems_Team
Moderator
   VIP   

Hi Nagalla,

I'm attaching the latest ncflbc from the problem client.  Near the end of the file is this error:

0xE0009741:Failed to mount one or more virtual disk images. Restores that use Granular Recovery Technology may not be available from this backup set.)

I have done some research for this, but nearly all the links I find are related to Backup Exec and not NetBackup.

I'm leaving the office now, but will check in tonight to see if there are further comments.

Many Thanks,

Steve

Marianne
Level 6
Partner    VIP    Accredited Certified

What is the fragment size in the Storage Unit config?

The default 1TB fragment size is too big when it comes to GRT and causes issues/timeouts when browsing for restores.

A 5000 MB fragment size is recommended.

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

You are restoring from a full right?

Systems_Team
Moderator
   VIP   

Hi Folks,

@Marianne - Thanks for that bit of info.  I had read a couple of TNs and posts regarding that, but as the system had just been built by a third party I wasn't game to change that until I had some more info.  I've had a look at the maximum fragment size for the storage unit, and it is currently set to 40960 megabytes.  Are there any negative implications in changing the fragment size?  There are two environments (Primary & DR) which perform their backups locally and then replicate to the alternate site.  I'll change the fragment size shortly, run a new Full and try a GRT restore and give you an update.

@Riaan - Yes Riaan, restoring from a Full.  When the system was originally built and I had to do some work to try and fix the VM Exchange policies, I found incrementals weren't supported, so I modified what would have been my daily incrementals to be daily Fulls.

Thx,

Steve

Systems_Team
Moderator
   VIP   

I changed the maximum fragment size on the disk storage units from 40960 megabytes down to 5000 megabytes as per Marianne's suggestion and then ran another full backup and attempted a restore.  Unfortunately the result was the same (Database error).

What is interesting is that even with the maximum fragment size set at 40960 megabytes, a traditional MS-Exchange backup using snapshot backup and GRT can be browsed down to the email message level with no problems whatsoever - only when using VMware policies to protect Exchange does this issue happen.

Should I leave my disk storage units set at 5000MB?

Thx,

Steve

Systems_Team
Moderator
   VIP   

Anyone have any other ideas?  If not I guess I'll have to take this one off to support.

Ziggy1941
Level 4
Partner Accredited

I have the same error (VMware policy and Exchange 2010):

0,51216,158,351,50,1389936999754,4252,10192,0:,203:FS_AttachToDLE() Failed! (0xE0009741:Failed to mount one or more virtual disk images. Restores that use Granular Recovery Technology may not be available from this backup set.) (../ResourceChild.cpp:667),28:ResourceChildBEDS::_attach(),1
 
And as you said all works fine with traditional MS-Exchange policy.
 
I have no ideas either (((

Bart_S_
Level 3
Partner Accredited

I have a similar problem, but with an MS-Exchange policy. Backups run fine, but I can't browse emails in the BAR GUI.

I thought the reason was the Client for NFS service (http://www.symantec.com/docs/TECH205533), but the service looks OK (the server has been rebooted too).

Funny thing is that I back up 8 Exchange DBs with 2 media servers (they are MBX servers), so the DBs are backed up locally and sent using the OST plugin to Data Domain over IP. The first MBX backs up 4 DBs, the second the other 4. It is possible to browse mails only from databases which are backed up by the first MBX media server. All DBs backed up by the second MBX media server show "ERROR: database system error" in the BAR GUI... The two MBX media servers are configured exactly the same way.

Bart_S_
Level 3
Partner Accredited

What's more, I found these entries on the affected media server (the second MBX):

ncfnbbrowse logs:
 

0,51216,361,359,1,1390404814083,7680,11756,0:,64:Unable to get key value DB2PATH, error 2 (../Db2Registry.cpp:76),25:Db2Registry::addInstances,1
0,51216,361,359,2,1390404814083,7680,11756,0:,58:Failed to open db2 registry entry. (../Db2Registry.cpp:95),25:Db2Registry::addInstances,1
0,51216,361,359,3,1390404814083,7680,11756,0:,52:Unable to retrieve DB2 instances (../Db2Api.cpp:586),20:Db2Api::getInstances,1
2,51216,361,359,4,1390404814083,7680,11756,0:,0:,0:,0,(12|)
0,51216,309,359,117,1390404814083,7680,11756,0:,96:Processing ProcessDebug:nbbrowse; using the Window's registry (../DefaultConfigProvider.cpp:341),40:DefaultConfigProvider::getValueNonBpConf,1
1,51216,309,359,118,1390404814083,7680,11756,0:,0:,40:DefaultConfigProvider::getValueNonBpConf,1,(36|A20:\HKEY_LOCAL_MACHINE\|A61:SOFTWARE\Veritas\NetBackup\CurrentVersion\Config\ProcessDebug|A8:nbbrowse|D32:2|)
0,51216,309,359,119,1390404814114,7680,11756,0:,46:Failed to sync buffers (../NcfStreams.cpp:238),45:AceHandleStreamBuffer::~AceHandleStreamBuffer,1

winbargui logs:

0,51216,264,264,3,1390404821491,6716,12164,0:,37:Could not open the file. Line No: 133,22:CBPCDFile::ReadFile( ),1
0,51216,264,264,4,1390404821491,6716,12164,0:,39:Flat File Buffer is Empty! Line No: 484,31:CFlatFileParser::GetKeyValue( ),1
0,51216,264,264,5,1390404821491,6716,12164,0:,39:Flat File Buffer is Empty! Line No: 484,31:CFlatFileParser::GetKeyValue( ),1
0,51216,264,264,6,1390404821491,6716,12164,0:,39:Flat File Buffer is Empty! Line No: 484,31:CFlatFileParser::GetKeyValue( ),1
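The raw entries above are hard to read by eye, so here is a minimal Python sketch that pulls the useful parts out of one line. The field positions are assumptions inferred from the lines quoted in this thread, not from official documentation: the sixth field is treated as a millisecond epoch timestamp, the seventh and eighth as process and thread IDs, and the first length-prefixed string (for example `37:Could not open...`) as the message text.

```python
from datetime import datetime, timezone


def parse_raw_log_line(line: str) -> dict:
    """Extract time, PID, TID and message from one raw log line (assumed layout)."""
    fields = line.split(",", 9)  # nine leading fields, remainder kept intact
    millis = int(fields[5])
    when = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    when = when.replace(microsecond=(millis % 1000) * 1000)
    # The remainder starts with "<length>:<message>"; the length is in characters.
    length, _, tail = fields[9].partition(":")
    return {
        "time": when,
        "pid": int(fields[6]),
        "tid": int(fields[7]),
        "message": tail[: int(length)],
    }
```

Run against the first winbargui line above, this yields the message "Could not open the file. Line No: 133" with a UTC timestamp of 2014-01-22 15:33:41.491, which makes it easier to line entries up with the time of the failed browse.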

It doesn't matter whether I start the restore from the master server or from the media server/client. There is always an "ERROR: database system error" in the BAR GUI.

Any help? Thx in advance :)

 

rawilde
Level 4
Employee

I took a look at the ncflbc log you uploaded and to expose the VMware VDDK traces, you will need to increase verbosity and upload a new log.

On the target client (Exchange server where browse is taking place), you can use the BAR gui to set general=2 and verbose=5.

 

Also see Ziggy1941 solution for a similar issue here:

https://www-secure.symantec.com/connect/forums/exchange-2010-grt-error-restore#/comment-9717601

 

Let me know the results and/or upload a new log with more tracing.

rawilde
Level 4
Employee

Bart

 

See my comment above (in this original topic) about creating a more verbose ncflbc log.  If you can collect that log and upload it (to the forum or directly to me) I can take a look at it.

Systems_Team
Moderator
   VIP   

Hi Rawilde,

Looks like Bart has hijacked my thread :p

I raised a case with Support yesterday but so far have had no response other than the case has been logged.  If you have time would you mind having a quick look at mine?  I'm guessing you will be faster than waiting for Support to respond :D

Problem-NCFLBC is a verbose log from an attempted GRT restore done from a VMware policy which fails with the database error.  GOOD-NBFLBC is just for reference - it is an attempted GRT restore from a standard Exchange Snapshot GRT backup which has no problems.

Many Thanks,

Steve

Bart_S_
Level 3
Partner Accredited

Hi Rawilde and everyone,

I'm sending the ncflbc logs from the MBX media server which Rawilde asked for. What's more, yesterday I opened a technical case with Symantec Support, but they haven't helped me yet.

I've tried increasing the client connect and read timeouts on the master and media servers to 900 secs, but I still get the same error.


Thanks in advance,
Bart

rawilde
Level 4
Employee

I will take a look today.

rawilde
Level 4
Employee

Bart

 

I took a look and it looks like an NFS issue....I marked the key error lines in bold.

I would look at the logs\nbfsd log on the client.  This is where the networking wmnet and NFS communication stuff is logged.  You may want to check on the health of the "client for NFS" on this server and possibly even re-install it.

Otherwise, it could be an issue on the media server NBFSd accepting the communications, but I would guess it's a client side NFS issue.

 

Take a look and let me know what you see.

 

Excerpt from your log:

1/24/2014 04:20:20.156 [LBCCallback::nbfs_setup] Timeout: 3600 (../LBCCallback.cpp:431)
1/24/2014 04:20:20.156 V-309-28 [nbfs_mount()] INF - waiting for mount mutex
1/24/2014 04:20:20.187 V-309-28 [nbfs_mount()] INF - running command - "C:\Program Files\Veritas\NetBackup\bin\nbfs" mount -server vm8r2mbxsok.nbpdom.win -port 7394 -timeout 60 -retry 11 -cred F45BB2A024...
1/24/2014 04:20:42.470 [BRMObserverDepreciated::write] Sending to bpbrm:  (../BRMObserverDepreciated.cpp:359)
1/24/2014 04:20:42.470 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b160 ) containing (1) bytes of data. (../BufferManager.cpp:508)
1/24/2014 04:20:42.470 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b160 ). (../BufferManager.cpp:514)
1/24/2014 04:21:04.472 [BRMObserverDepreciated::write] Sending to bpbrm:  (../BRMObserverDepreciated.cpp:359)
1/24/2014 04:21:04.472 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b0c0 ) containing (1) bytes of data. (../BufferManager.cpp:508)
1/24/2014 04:21:04.472 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b0c0 ). (../BufferManager.cpp:514)
1/24/2014 04:21:26.474 [BRMObserverDepreciated::write] Sending to bpbrm:  (../BRMObserverDepreciated.cpp:359)
1/24/2014 04:21:26.474 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b160 ) containing (1) bytes of data. (../BufferManager.cpp:508)
1/24/2014 04:21:26.474 [BufferManager::releaseBuffer()] Released Buffer (0x0000000001b8b160 ). (../BufferManager.cpp:514)
1/24/2014 04:21:33.490 V-309-28 [nbfs_mount()] INF - response - EXIT_STATUS=25
1/24/2014 04:21:33.631 V-309-28 [nbfs_mount()] ERR - timeout period exceeded
1/24/2014 04:21:33.631 V-309-28 [nbfs_mount()] ERR - nothing is mounted
1/24/2014 04:21:33.631 [LBCCallback::nbfs_setup] nbfs_mount() failed: %d11 (../LBCCallback.cpp:449)
1/24/2014 04:21:33.631 [Error] V-351-5 failed to initialize nbfs
1/24/2014 04:21:33.647 [Error] V-351-5 failed to initialize nbfs
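To see how long the mount attempt in the excerpt actually waited, the timestamps on the "running command" and "timeout period exceeded" lines can be subtracted. A minimal sketch in Python, assuming formatted log lines that start with `M/D/YYYY HH:MM:SS.mmm` as above; the function names are made up for illustration.

```python
from datetime import datetime

STAMP_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def stamp(line: str) -> datetime:
    """Parse the leading date and time of a formatted log line."""
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", STAMP_FORMAT)


def mount_wait_seconds(lines):
    """Seconds between the nbfs mount command starting and its timeout error."""
    start = end = None
    for line in lines:
        if start is None and "running command" in line:
            start = stamp(line)
        if "timeout period exceeded" in line:
            end = stamp(line)
    if start is None or end is None:
        return None  # one of the two markers is missing from the excerpt
    return (end - start).total_seconds()
```

For the excerpt above this gives 73.444 seconds (04:20:20.187 to 04:21:33.631), so the mount gave up after a little over a minute with nothing mounted.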