Enterprise Vault 2007 SP5 FSA cannot retrieve files from NBU to NetApp

arthurkuo
Level 2
Partner

Hi sir

I have a problem with EV 2007 and a NetApp filer. My environment is as follows:

EV and NBU are installed on the same server: Windows 2003 Enterprise (32-bit)

Enterprise Vault version: EV 2007 SP5

NetBackup: NBU 6.5 MP5

NetApp ONTAP: 7.2.5

Description:

I use Enterprise Vault 2007 FSA to archive files on NetApp storage and use NBU to migrate the archived files to tape, leaving shortcuts (placeholders) on the NetApp filer. Now when I try to retrieve files from the shortcuts, it fails. Can someone help me? Thanks.

9 REPLIES 9

JesusWept3
Level 6
Partner Accredited Certified

Are there any errors on the EV server?
How big is the file? Do any files retrieve, or do they all fail?
When you say "failed", are you actually seeing any errors? If you wait a while, does the item get retrieved?
Do you see the retrieval request being made in NBU?
Has it ever worked?
 

You may want to get a dtrace of StorageOfflineOpns and see if it gives any clues.

https://www.linkedin.com/in/alex-allen-turl-07370146

Jeff_Shotton
Level 6
Partner Accredited Certified

Also, look at the following whitepaper - there is a fair amount on how to collect logs to troubleshoot the recall process:

http://www.symantec.com/docs/TECH70427

Regards,

Jeff

 

arthurkuo
Level 2
Partner

Hi Sir

When I copy the shortcut to another path, I can't see any restore job in NBU, but I can use FSAUtility to restore one file in a directory. I have attached the log; please help me check it. Thanks.

 

Jeff_Shotton
Level 6
Partner Accredited Certified

Well, you are getting to the point where the request is being passed off to NBU, but then it is failing:

 

488    11:54:11.835     [3744]    (EVStgOfflineOpns)    <9608>    EV:H    NBU Migrator: *Err* 000-014051.014 RetrieveFile API:Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor.

489    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf04 001-014051.015 XBSA_MThread    :Thread has exited, rtn =  00000003
490    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.016 PrintStrings    :0405EE40 Returned from XBSA_RetrieveFile()
491    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.017 PrintStrings    :  0 = EV_Policy=EV_Default_Policy_VT1
492    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.018 PrintStrings    :  1 = EV_FileId=B01F57FF354CFE478380637DD4767CD3+cgmhev.kaohsung.com+01345514946+2012\08\13\Collection133627.CAB
493    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.019 PrintStrings    :  2 = EV_FilePath=H:\Enterprise Vault Stores\VT1 Ptn1\2012\08\13\Collection133627.ARCHCAB
494    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.020 PrintStrings    :  3 = NULL
495    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.021 PrintStrings    :  4 = Error=00000003: System detected error, operation aborted.
496    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:L    NBU Migrator: Inf05 001-014051.022 PrintStrings    :  5 = NULL
497    11:54:11.835     [3744]    (EVStgOfflineOpns)    <2392>    EV:H    NBU Migrator: *Err* 001-014051.023 RetrieveFile API:Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor.
498    11:54:11.851     [3744]    (EVStgOfflineOpns)    <9608>    EV~E    Event ID: 6954 The 3rd party Migrator application 'NBU Migrator' has logged the following message: |Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor. |
499    11:54:11.851     [3744]    (EVStgOfflineOpns)    <9608>    EV:L    NBU Migrator: Inf06 000-014051.024 RetrieveFile API:Exiting, rtn = 80004005
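
When a trace contains many of these failures, it can help to collect the PrintStrings fields into one record per failed retrieve. Here is a minimal Python sketch, assuming the dtrace has been saved as plain text and that each failure dumps its fields in the `N = Key=Value` form shown above, ending with an `Error=` entry. The function name `failed_retrievals` and the `SAMPLE` list are illustrative only and are not part of EV:

```python
import re

# Payload of a PrintStrings trace line, e.g. ":  2 = EV_FilePath=H:\...".
# Lines such as ":  3 = NULL" carry no Key=Value pair and are skipped.
PRINT_STRINGS = re.compile(r"PrintStrings\s*:\s*\d+\s*=\s*(\w+)=(.*)$")


def failed_retrievals(lines):
    """Return one dict per failure; a record is closed by its Error= entry.

    Assumption: the fields of one failure are logged together and the
    Error= entry comes last, as in the excerpt above.
    """
    records, current = [], {}
    for line in lines:
        match = PRINT_STRINGS.search(line)
        if match is None:
            continue
        key, value = match.group(1), match.group(2).strip()
        current[key] = value
        if key == "Error":
            records.append(current)
            current = {}
    return records


# Shortened copies of the trace lines quoted above.
SAMPLE = [
    r"491  11:54:11.835  [3744]  (EVStgOfflineOpns)  <2392>  EV:L  NBU Migrator: Inf05 001-014051.017 PrintStrings    :  0 = EV_Policy=EV_Default_Policy_VT1",
    r"493  11:54:11.835  [3744]  (EVStgOfflineOpns)  <2392>  EV:L  NBU Migrator: Inf05 001-014051.019 PrintStrings    :  2 = EV_FilePath=H:\Enterprise Vault Stores\VT1 Ptn1\2012\08\13\Collection133627.ARCHCAB",
    r"494  11:54:11.835  [3744]  (EVStgOfflineOpns)  <2392>  EV:L  NBU Migrator: Inf05 001-014051.020 PrintStrings    :  3 = NULL",
    r"495  11:54:11.835  [3744]  (EVStgOfflineOpns)  <2392>  EV:L  NBU Migrator: Inf05 001-014051.021 PrintStrings    :  4 = Error=00000003: System detected error, operation aborted.",
]

for record in failed_retrievals(SAMPLE):
    print(record.get("EV_FilePath"), "->", record.get("Error"))
```

The resulting list of file paths can then be compared against the failed jobs in the NBU Activity Monitor.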

But from looking at this log, and then looking at the XBSA log, I can see that you are creating hundreds of migration jobs. You are migrating lots of CAB files after a very short time period, and that has been known to cause resource issues in the past because of the number of jobs that queue up. This isn't a very good strategy unless you are never going to recall those files again.

When you look in the NBU Activity Monitor, are lots of jobs listed? Can you retry the failed ones?

Does this work correctly when there are a low number of jobs, or does it ALWAYS work with FSAUtility and fail with a shortcut copy?

Regards,

Jeff

 

 

arthurkuo
Level 2
Partner

Hi Jeff

When I tried to copy the archived file to another location, I couldn't find any restore job in the NBU Activity Monitor, so copying the shortcut to another location failed.

The log you saw is from when I used FSAUtility to restore the archived files, and it is strange: when I use FSAUtility, I can only restore one file in the directory, but in the NBU Activity Monitor I can see restore jobs running, and some of the jobs failed. Can you tell me what to do? Thanks!

            

Rob_Wilcox1
Level 6
Partner

I vaguely recall quite a few bugs in this recalling-from-NBU area in EV 2007.  Has this issue been discussed with Symantec Support?

Working for cloudficient.com

Jeff_Shotton
Level 6
Partner Accredited Certified

Yes, there were a fair few bugs, but if I recall, some of the code to clear up the error code '3', which was due to open file handles, was already included by 6.5.5, and the other stuff for the time bracketing appears to be there because the time brackets are fairly small in the logs.

Anyway, Arthur, I think you also need to provide a trace of what happens when you copy a file, but to be honest you are still suffering from the huge amount of activity that you are pushing through to NBU and that is only going to change with a modification to your migration strategy. How many jobs are you seeing queued up at a time?

Can you also find out which version of XBSA.dll and NBUMigrator.dll you are using on the EV server?

You should be raising a support call at this point though - the question is, should that be with EV, or NBU. You need to check your NBU activity monitor and look at the failures (and check if they are persistent failures) to make that decision.

Regards,

Jeff

 

 

arthurkuo
Level 2
Partner

Hi Jeff

First, thanks for your reply. I checked the xbsa.dll and nbumigrator.dll versions, and they are as follows:

xbsa.dll 6.5.2009.1105

nbumigrator.dll 6.0.0.11

Thanks.

Jeff_Shotton
Level 6
Partner Accredited Certified

Hi Arthur,

Well, the good news is that you are running versions of XBSA.dll and NBUMigrator.dll which are newer than the ones containing the bugs Rob and I were talking about. However, that doesn't mean you can't exhaust system resources.

Fundamentally there shouldn't be any difference between how the items are being recalled via FSAUtility or by the placeholder service. Both methods will be using StorageOfflineOpns to retrieve the file, and in both cases the Vault Service account should be the account used.

I checked through the traces again to see if there was anything else that could tell us what is going wrong, but all I can see is this:

11:53:56.272 [3744.6812] <4> sendRequest: calling get_bprdESwto(): sockfd=<2604>, timeout=<0>
11:53:56.772 [3744.6812] <16> sendRequest: ERR - returned status is <130>: <system error occurred>
11:53:56.772 [3744.6812] <8> sendRequest: CHECK the following progress info of the restore for any error messages:
11:54:06.773 [3744.6812] <4> readCommMessages: Entering readCommMessages
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 INF - Server status = 130>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 Status of restore from image created 08/21/12 10:09:06 = system error occurred>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 The following files/folders were not restored:>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 UTF - /H/Enterprise Vault Stores/VT1 Ptn1/2012/08/13/Collection133627.CAB>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:57 (283137.xxx) INF - Status = system error occurred.>
11:54:11.773 [3744.6812] <4> logCommFile: done reading comm file
11:54:11.773 [3744.6812] <16> VxBSABeginF leListenterie: ERR - restore failed
11:54:11.773 [3744.6812] <4> VxBSAGetEnv: INF - returning - DataStore
11:54:11.773 [3744.6812] <16> VxBSABeginFileListRestore: ERR - restore failed
11:54:11.773 [3744.6812] <8> : -Wrn- 001-014051.001 XBSA_RetrieveFile:Premature exit of while loop, step 3, status = 3
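
To check whether other restores in the same log fail the same way, the status codes and the "not restored" paths can be pulled out per thread. Here is a rough Python sketch, assuming the XBSA log is plain text in the `time [pid.tid] <severity> message` layout of the excerpt above (reading the bracketed pair as process ID and thread ID is my assumption). `summarize_failures` and `SAMPLE` are illustrative names, not NetBackup tooling:

```python
import re
from collections import defaultdict

# "HH:MM:SS.mmm [pid.tid] <severity> message", as in the excerpt above.
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+\.\d+)\] <\d+> (.*)$")
# sendRequest reports the NBU status code of the failed request.
STATUS = re.compile(r"returned status is <(\d+)>")
# logCommFile echoes each unrestored path on a "UTF - <path>" line.
NOT_RESTORED = re.compile(r"<\d{2}:\d{2}:\d{2} UTF - (.+)>$")


def summarize_failures(lines):
    """Map each pid.tid to the status codes and unrestored paths it logged."""
    summary = defaultdict(lambda: {"statuses": [], "files": []})
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if parsed is None:
            continue
        _, thread, message = parsed.groups()
        status = STATUS.search(message)
        if status:
            summary[thread]["statuses"].append(int(status.group(1)))
        unrestored = NOT_RESTORED.search(message)
        if unrestored:
            summary[thread]["files"].append(unrestored.group(1))
    return dict(summary)


# Lines copied from the excerpt above.
SAMPLE = [
    "11:53:56.772 [3744.6812] <16> sendRequest: ERR - returned status is <130>: <system error occurred>",
    "11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 INF - Server status = 130>",
    "11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 UTF - /H/Enterprise Vault Stores/VT1 Ptn1/2012/08/13/Collection133627.CAB>",
    "11:54:11.773 [3744.6812] <16> VxBSABeginFileListRestore: ERR - restore failed",
]

for thread, info in summarize_failures(SAMPLE).items():
    print(thread, info["statuses"], info["files"])
```

Only the `returned status is` line is counted as a status, so the echoed `Server status = 130` line does not count the same failure twice.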

The image that is associated with that restore is

RestoreFileObjects: INF - restoring File:<E:\Veritas\NetBackup\Logs\user_ops\dbext\logs\3744.1350014033.image>

See if you can restore this backup image - it should give you /H/Enterprise Vault Stores/VT1 Ptn1/2012/08/13/Collection133627.CAB

The same log contains a successful restore (issued on its own), while this failure happens at exactly the same millisecond as another thread, which is also failing to restore. So I can see *some* things are successfully restored.
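
One way to see how often failures coincide like this is to group the `restore failed` lines by timestamp and list the timestamps shared by more than one thread. Here is a small Python sketch under the same assumptions about the log layout (the second number in the brackets is taken to be the thread ID). The function name and the second and third sample lines are made up for illustration and do not come from Arthur's log:

```python
import re
from collections import defaultdict

# "HH:MM:SS.mmm [pid.tid] <severity> message"; group 2 is the assumed thread ID.
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[\d+\.(\d+)\] <\d+> (.*)$")


def simultaneous_failures(lines):
    """Return {timestamp: sorted thread IDs} where 2+ threads logged a failed restore."""
    by_time = defaultdict(set)
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if parsed and "ERR - restore failed" in parsed.group(3):
            by_time[parsed.group(1)].add(parsed.group(2))
    return {ts: sorted(tids) for ts, tids in by_time.items() if len(tids) > 1}


SAMPLE = [
    "11:54:11.773 [3744.6812] <16> VxBSABeginFileListRestore: ERR - restore failed",
    # Hypothetical lines: thread IDs 7001 and 6900 are invented for the example.
    "11:54:11.773 [3744.7001] <16> VxBSABeginFileListRestore: ERR - restore failed",
    "11:58:02.120 [3744.6900] <16> VxBSABeginFileListRestore: ERR - restore failed",
]

print(simultaneous_failures(SAMPLE))
```

If most failures show up in such groups while restores issued on their own succeed, that would fit the resource-exhaustion theory rather than a problem with individual images.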


I think your errors are likely coming from the NBU side though, so this is where you should consider raising the support case.


Regards,
Jeff