10-08-2012 09:04 PM
Hi sir
I have a problem with EV 2007 and a NetApp filer. My environment is as follows:
EV and NBU are installed on the same server: Windows 2003 Enterprise (32-bit)
Enterprise Vault Version : EV 2007 SP5
NetBackup : NBU 6.5 MP5
NetApp OnTap : 7.2.5
Description :
I use Enterprise Vault 2007 FSA to archive files on NetApp storage, and NBU to migrate the archived files to tape, leaving shortcuts (placeholders) on the NetApp filer. When I try to retrieve files from a shortcut, the retrieval fails. Can someone help me? Thanks.
10-09-2012 12:39 AM
Are there any errors on the EV server?
How big is the file? Do any files retrieve, or do they all fail?
When you say "failed", are you actually seeing any errors? If you wait a while, does the item get retrieved?
Do you see the retrieval request being made in NBU?
Has it ever worked?
You may want to get a dtrace of StorageOfflineOpns and see if it gives any clues.
10-09-2012 05:26 PM
Also, look at the following whitepaper - there is a fair amount on how to collect logs to troubleshoot the recall process:
http://www.symantec.com/docs/TECH70427
Regards,
Jeff
10-21-2012 11:28 PM
Hi Sir
When I copy the shortcut to another path, I can't see any restore job in NBU, but I can use FSAUtility to restore one file in a directory. I have attached the log. Please help me check it. Thanks.
10-23-2012 02:28 AM
Well, you are getting to the point where the request is being passed off to NBU, but then it is failing:
488 11:54:11.835 [3744] (EVStgOfflineOpns) <9608> EV:H NBU Migrator: *Err* 000-014051.014 RetrieveFile API:Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor.
489 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf04 001-014051.015 XBSA_MThread :Thread has exited, rtn = 00000003
490 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.016 PrintStrings :0405EE40 Returned from XBSA_RetrieveFile()
491 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.017 PrintStrings : 0 = EV_Policy=EV_Default_Policy_VT1
492 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.018 PrintStrings : 1 = EV_FileId=B01F57FF354CFE478380637DD4767CD3+cgmhev.kaohsung.com+01345514946+2012\08\13\Collection133627.CAB
493 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.019 PrintStrings : 2 = EV_FilePath=H:\Enterprise Vault Stores\VT1 Ptn1\2012\08\13\Collection133627.ARCHCAB
494 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.020 PrintStrings : 3 = NULL
495 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.021 PrintStrings : 4 = Error=00000003: System detected error, operation aborted.
496 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf05 001-014051.022 PrintStrings : 5 = NULL
497 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:H NBU Migrator: *Err* 001-014051.023 RetrieveFile API:Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor.
498 11:54:11.851 [3744] (EVStgOfflineOpns) <9608> EV~E Event ID: 6954 The 3rd party Migrator application 'NBU Migrator' has logged the following message: |Failed to retrieve file (error 00000003: System detected error, operation aborted.). Check the NBU Activity Monitor. |
499 11:54:11.851 [3744] (EVStgOfflineOpns) <9608> EV:L NBU Migrator: Inf06 000-014051.024 RetrieveFile API:Exiting, rtn = 80004005
But from looking at this log, and then at the XBSA log, I can see that you are creating hundreds of migration jobs. You are migrating lots of CAB files after a very short time period, and that has been known to cause resource issues in the past because of the number of jobs queued up. This isn't a very good strategy unless you are never going to recall those files again.
When you look in the NBU Activity Monitor, are lots of jobs listed? Can you retry the failed ones?
Does this work correctly when there are a low number of jobs, or does it ALWAYS work with FSAUtility and fail with a shortcut copy?
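If it helps with triage, here is a rough Python sketch for pulling the NBU Migrator `*Err*` lines out of a saved dtrace log and counting the error codes. The line layout in the regex is an assumption based only on the excerpt above (sequence number, time, process id, process name, thread id, message), and `migrator_errors` is just a name I made up for illustration, so adjust it to whatever your trace actually looks like.

```python
import re
from collections import Counter

# Assumed dtrace layout, inferred from the excerpt in this thread only:
# <seq> <time> [<pid>] (<process>) <<tid>> <message>
LINE_RE = re.compile(
    r"^(?P<seq>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\[(?P<pid>\d+)\]\s+\((?P<proc>[^)]+)\)\s+<(?P<tid>\d+)>\s+(?P<msg>.*)$"
)
# Matches the "(error 00000003: ..." fragment inside the message text.
ERROR_CODE_RE = re.compile(r"\(error ([0-9A-Fa-f]{8}):")


def migrator_errors(lines):
    """Return (time, thread id, error code) for each NBU Migrator *Err* line."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or "NBU Migrator: *Err*" not in m["msg"]:
            continue
        code = ERROR_CODE_RE.search(m["msg"])
        found.append((m["time"], m["tid"], code.group(1) if code else None))
    return found


# Three lines copied from the trace quoted above.
sample = [
    "488 11:54:11.835 [3744] (EVStgOfflineOpns) <9608> EV:H NBU Migrator: *Err* "
    "000-014051.014 RetrieveFile API:Failed to retrieve file (error 00000003: "
    "System detected error, operation aborted.). Check the NBU Activity Monitor.",
    "489 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:L NBU Migrator: Inf04 "
    "001-014051.015 XBSA_MThread :Thread has exited, rtn = 00000003",
    "497 11:54:11.835 [3744] (EVStgOfflineOpns) <2392> EV:H NBU Migrator: *Err* "
    "001-014051.023 RetrieveFile API:Failed to retrieve file (error 00000003: "
    "System detected error, operation aborted.). Check the NBU Activity Monitor.",
]

errors = migrator_errors(sample)
code_counts = Counter(code for _, _, code in errors)
for time, tid, code in errors:
    print(time, tid, code)
```

To run it against a full trace, pass the lines of the saved file instead of `sample`. A count that is dominated by one code at a handful of timestamps would fit the "too many jobs at once" theory better than a steady trickle of failures.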
Regards,
Jeff
10-23-2012 07:57 PM
Hi Jeff
When I tried to copy the archived file to another location, I couldn't find any restore job in the NBU Activity Monitor, so copying the shortcut to another location failed.
The log you saw is from when I used FSAUtility to restore the archived files. It is strange: when I use FSAUtility, I can only restore one file in the directory, but in the NBU Activity Monitor I can see restore jobs running, and some of the jobs failed. Can you tell me what to do? Thanks!
10-24-2012 01:23 AM
I vaguely recall quite a few bugs in this recalling-from-NBU area in EV 2007. Has this issue been discussed with Symantec Support?
10-24-2012 05:18 PM
Yes, there were a fair few bugs, but if I recall, some of the code to clear up the error code '3' (which was due to open file handles) was already included by 6.5.5, and the other fix for the time bracketing appears to be there, because the time brackets are fairly small in the logs.
Anyway, Arthur, I think you also need to provide a trace of what happens when you copy a file, but to be honest you are still suffering from the huge amount of activity that you are pushing through to NBU and that is only going to change with a modification to your migration strategy. How many jobs are you seeing queued up at a time?
Can you also find out which version of XBSA.dll and NBUMigrator.dll you are using on the EV server?
You should be raising a support call at this point though - the question is, should that be with EV, or NBU. You need to check your NBU activity monitor and look at the failures (and check if they are persistent failures) to make that decision.
Regards,
Jeff
10-31-2012 06:05 PM
Hi Jeff
First, thanks for your reply. I checked the XBSA.dll and NBUMigrator.dll versions, as follows:
xbsa.dll 6.5.2009.1105
nbumigrator.dll 6.0.0.11
thanks .
11-02-2012 05:23 PM
Hi Arthur,
Well, the good news is that you are running versions of XBSA.dll and NBUMigrator.dll which are newer than the ones containing the bugs Rob and I were talking about. However, that doesn't mean you can't exhaust the system of resources.
Fundamentally there shouldn't be any difference between how the items are being recalled via FSAUtility or by the placeholder service. Both methods will be using StorageOfflineOpns to retrieve the file, and in both cases the vault service account should be the account used.
I checked through the traces again to see if there was anything else that could tell us what is going wrong, but all I can see is this:
11:53:56.272 [3744.6812] <4> sendRequest: calling get_bprdESwto(): sockfd=<2604>, timeout=<0>
11:53:56.772 [3744.6812] <16> sendRequest: ERR - returned status is <130>: <system error occurred>
11:53:56.772 [3744.6812] <8> sendRequest: CHECK the following progress info of the restore for any error messages:
11:54:06.773 [3744.6812] <4> readCommMessages: Entering readCommMessages
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 INF - Server status = 130>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 Status of restore from image created 08/21/12 10:09:06 = system error occurred>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 The following files/folders were not restored:>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:56 UTF - /H/Enterprise Vault Stores/VT1 Ptn1/2012/08/13/Collection133627.CAB>
11:54:11.773 [3744.6812] <4> logCommFile: <11:53:57 (283137.xxx) INF - Status = system error occurred.>
11:54:11.773 [3744.6812] <4> logCommFile: done reading comm file
11:54:11.773 [3744.6812] <16> VxBSABeginF leListenterie: ERR - restore failed
11:54:11.773 [3744.6812] <4> VxBSAGetEnv: INF - returning - DataStore
11:54:11.773 [3744.6812] <16> VxBSABeginFileListRestore: ERR - restore failed
11:54:11.773 [3744.6812] <8> : -Wrn- 001-014051.001 XBSA_RetrieveFile:Premature exit of while loop, step 3, status = 3
The image that is associated with that restore is
RestoreFileObjects: INF - restoring File:<E:\Veritas\NetBackup\Logs\user_ops\dbext\logs\3744.1350014033.image>
See if you can restore this backup image - it should give you /H/Enterprise Vault Stores/VT1 Ptn1/2012/08/13/Collection133627.CAB
The same log contains a successful restore (issued on its own), while this failure happens at exactly the same millisecond as another thread, which is also failing to restore. So I can see *some* things are successfully restored.
I think your errors are likely coming from the NBU side though, so this is where you should consider raising the support case.
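If you want to check the "same millisecond" pattern yourself across the whole XBSA log, something along these lines might do it. This is only a sketch: the regex assumes the layout seen in the excerpt above, and I am assuming the bracketed pair is process id and thread id. `failures_by_timestamp` is a hypothetical helper name, not anything shipped with NBU.

```python
import re
from collections import defaultdict

# Assumed XBSA log layout, inferred from the excerpt above:
# <time> [<pid>.<tid>] <<severity>> <function>: <message>
XBSA_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+\[(?P<pid>\d+)\.(?P<tid>\d+)\]\s+"
    r"<(?P<sev>\d+)>\s+(?P<func>[^:]*):\s*(?P<msg>.*)$"
)


def failures_by_timestamp(lines):
    """Map each timestamp to the set of thread ids that logged 'restore failed'."""
    grouped = defaultdict(set)
    for line in lines:
        m = XBSA_RE.match(line.strip())
        if m and "ERR - restore failed" in m["msg"]:
            grouped[m["time"]].add(m["tid"])
    return dict(grouped)


# Lines copied from the excerpt above; they all come from one thread.
sample = [
    "11:53:56.772 [3744.6812] <16> sendRequest: ERR - returned status is <130>: <system error occurred>",
    "11:54:11.773 [3744.6812] <4> logCommFile: done reading comm file",
    "11:54:11.773 [3744.6812] <16> VxBSABeginFileListRestore: ERR - restore failed",
    "11:54:11.773 [3744.6812] <8> : -Wrn- 001-014051.001 XBSA_RetrieveFile:Premature exit of while loop, step 3, status = 3",
]

failures = failures_by_timestamp(sample)
# Timestamps where more than one thread failed at once.
concurrent = [t for t, tids in failures.items() if len(tids) > 1]
print(failures, concurrent)
```

On the short excerpt only one thread appears, so `concurrent` comes back empty. Run over the full log, any timestamps it lists are the ones worth lining up against the NBU Activity Monitor.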
Regards,
Jeff