06-09-2014 03:42 AM
This is becoming very frustrating: we tend to lose access to the FSA server when recalling EV files. FSA is on version 10.0.3 while Enterprise Vault is on 10.0.4.
We have enabled pass-through recall and are really stuck here.
The server becomes responsive only after the large file completes, which could take an hour.
Any ideas on how to solve this?
06-09-2014 03:47 AM
Always best to have the latest versions, and matching versions. So first of all update to the latest Cumulative Hotfix for 10.0.4.
Then to the problem - does this happen when you recall *one* file, or when you recall hundreds? What sort of controlled testing have you done?
What OS is the FSA Server?
06-09-2014 03:50 AM
I was under the impression that Windows 2008 FSA servers have to be on version 10.0.3, as they tend to blue screen on the latest Cumulative Hotfix for 10.0.4.
When I kill the Placeholder service the server becomes responsive, and I noticed 6 download errors totalling 20GB in size.
06-09-2014 03:51 AM
06-09-2014 04:29 AM
Interesting, I wasn't aware of that. Bit 'rubbish' to say to downgrade, in my opinion..
However, for the original post you are on 10.0.3, and still have issues. Therefore I'd recommend discussing it with Symantec Support. Have you tried that?
06-09-2014 04:55 AM
I have spoken to Symantec and they keep asking me to downgrade until I don't get the issue, which I think is a joke.
06-09-2014 05:18 AM
Yobole,
Which version of the CHF is installed on the EV server? The latest is CHF3.
Due to changes that were made in the driver, it is recommended that the file server be on the latest 10.0.4 CHF3 or on an earlier version with pass-through enabled. This avoids issues caused by changes that were made to the driver at Microsoft's recommendation. Those changes have been rolled back in the 10.0.4 CHF3 version of the driver.
In the earlier version all the volumes that are targets on the file server need to have pass-through recall enabled to avoid the issue with writing to the same location as the original file. It may be that the large file is not using pass-through, but a Dtrace of the recall would be needed to confirm. This can also be an issue if the EV server is very busy causing recalls to wait. Has the server been optimized?
- Recommended steps to optimize performance on Enterprise Vault (EV), Compliance Accelerator (CA), Discovery Accelerator (DA), and SQL Servers in an EV environment http://www.symantec.com/docs/TECH56172
- TCP Chimney, TCPIP Offload Engine (TOE) or TCP Segmentation Offload (TSO) will cause a transport-level error to be logged resulting in inaccurate hit counts for Accelerator searches: http://www.symantec.com/docs/TECH55653
06-09-2014 11:15 AM
Symantec's response, which I find acceptable. I have applied the reg key and also configured pass-through recall, but I am NOT comfortable downgrading to 9.0.4.
As discussed, the issue is occurring on Windows 2008 File Servers and currently the FSA Agent is on 10.0.3. I would recommend to downgrade the FSA Agent to 9.0.4 on Windows 2008 File Servers which are having problems. I have seen the server being stable after downgrading the FSA Agent to 9.0.4 and adding the following reg key on the File Servers
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\EvFilter\
DWORD value
Enter the name as IncrementVPBCount
Set the value as 1
If you do not want to downgrade the FSA Agent, then we will have to gather complete memory dumps to analyze what exactly the problem is, but at this point in time there is no known solution. The only workaround is to downgrade the FSA Agent to 9.0.4 and add the reg key.
06-09-2014 12:33 PM
Hi Yobole, there can be a few reasons for this behavior. plaudone has already detailed a couple of possible causes, the changes we made in 10.0.4 CHF3, and what you need to do as first steps. It is strongly recommended to upgrade to 10.0.4 CHF3 as mentioned above. I have worked with Microsoft in the past on such cases and would be interested to know if there are still hang issues on 10.0.4 CHF3. If there are, we can look into that.
Second, about the behavior: the EV Placeholder Service will NOT wait or hold until the whole file is downloaded. In pass-through mode, any file is downloaded in chunks. Also, a small cache size will affect performance, because the cache has a threshold for cleanup. After a file is downloaded, if the cache size reaches the threshold, cleanup is triggered. Cache cleanup deletes the least recently used files until enough space is freed for the new file.
Third, EV server performance and how busy the server is will also play a role in the delay.
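To illustrate why a small cache hurts, here is a minimal Python sketch of the least-recently-used cleanup described above. It is not the actual EV pass-through cache code; the class name `LruFileCache`, the `threshold_bytes` parameter, and the rule "evict until usage drops below the threshold" are all assumptions made for illustration.

```python
from collections import OrderedDict


class LruFileCache:
    """Hypothetical model of a pass-through cache with a cleanup threshold."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # assumed cleanup trigger
        self.files = OrderedDict()              # name -> size, oldest first
        self.used_bytes = 0

    def touch(self, name):
        # Reading a cached file makes it the most recently used entry.
        if name in self.files:
            self.files.move_to_end(name)

    def add(self, name, size):
        # Record a freshly downloaded file, then clean up if needed.
        if name in self.files:
            self.used_bytes -= self.files.pop(name)
        self.files[name] = size
        self.used_bytes += size
        evicted = []
        # Delete least recently used files, never the file just added.
        while self.used_bytes >= self.threshold_bytes and len(self.files) > 1:
            old_name, old_size = self.files.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_name)
        return evicted
```

With a threshold close to the size of a single large file, almost every download evicts everything else, so the same files get recalled from the EV server again and again.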
Once you upgrade, if you still experience the issue, you need to do a few things:
06-09-2014 01:02 PM
The 10.0.2 or 10.0.3 agent should be sufficient on the file server in that configuration as long as pass-through is configured and working properly. A Dtrace of the Placeholder service would be able to show if pass-through is being utilized when the issue occurs.
06-10-2014 11:02 PM
I upgraded the EV server and all FSA servers to EV 10.0.4 CHF3 last night.
06-10-2014 11:18 PM
Okay, I guess it's a waiting game now then? How often did it use to show this behaviour?
06-10-2014 11:23 PM
Normally on Monday mornings, so we will just see how it goes over the coming days. I also noticed another FSA server making a large number of recalls, which might be causing the bottleneck on the master EV server. I cannot seem to find out what is actually making these recalls, and I have the exclude registry key populated with the AV and backup exes.
However, this particular FSA server has 3000 orphaned placeholders, which I am not sure how to deal with.
06-11-2014 08:16 AM
We have had customers perform that upgrade and the issue was resolved.
You can Dtrace the EvPlaceholderService on the file server to determine what is making the recall requests. The entry will look like the following. Then you can exclude that application from recalling. If it comes back with pid:4, then that is a remote request and EV cannot provide the calling application name.
(EvPlaceholderService) <2984> EV:M WorkItem::GetExeName: The .exe name for for pid: 2696 is Explorer.EXE
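On a busy file server the Dtrace output can be large, so it may help to tally these entries per executable. The following Python sketch does that under some assumptions: the regular expression is based only on the `GetExeName` line format shown in this thread, and the function name `count_recall_callers` and the "remote request" label for pid 4 are made up for illustration.

```python
import re
from collections import Counter

# Matches lines like:
#   ... WorkItem::GetExeName: The .exe name for for pid: 2696 is Explorer.EXE
# The doubled "for" appears in the trace lines quoted in this thread.
EXE_LINE = re.compile(
    r"WorkItem::GetExeName: The \.exe name for (?:for )?pid:\s*(\d+) is (.+?)\s*$"
)


def count_recall_callers(lines):
    """Count recall requests per calling executable in Dtrace lines."""
    counts = Counter()
    for line in lines:
        match = EXE_LINE.search(line)
        if not match:
            continue
        pid, exe = match.groups()
        # pid 4 means a remote request; EV cannot name the application.
        name = "remote request (pid 4)" if pid == "4" else exe.lower()
        counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_callers.py dtrace.log
    with open(sys.argv[1], errors="replace") as log:
        for name, total in count_recall_callers(log).most_common():
            print(total, name)
```

The executables at the top of the list are the first candidates to exclude from recalling.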
For orphaned PH files you can run fsautility -o in report mode to show the orphaned files. You can also delete them with the same command in normal mode.
06-12-2014 12:08 PM
After the upgrade we had to reboot a Windows 2012 FSA server twice today after the server became unresponsive.
The event logs show EV error 7206:
A locking error has occurred in FileAllocEntry : Too many posts were made to a semaphore. (0x12a)
Internal reference: RERL Release Sema4 %4
06-16-2014 08:27 AM
Has it been determined if there is an application recalling files?
Would need more info from file server like Event logs and Dtrace to determine what is occurring.
07-04-2014 12:01 AM
It seems we have stopped the loss of shares by changing the SMB2 MaxThreadsPerQueue setting to 64 on all FSA servers:
MaxThreadsPerQueue (REG_DWORD) under HKLM\System\CurrentControlSet\Services\LanmanServer\Parameters
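For anyone who needs to apply the same value on several file servers, the setting can be written out as a `.reg` fragment. The Python sketch below only generates the text of such a fragment from the key path, value name, and value given above; the helper name `reg_dword_fragment` is an assumption for illustration, and the output should be reviewed before it is imported on a server.

```python
def reg_dword_fragment(key_path, name, value):
    """Render a .reg file fragment that sets a single REG_DWORD value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD value must fit in 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{name}"=dword:{value:08x}\n'  # .reg files store DWORDs in hex
    )


# Key path and value taken from the post above; 64 decimal is 0x40.
fragment = reg_dword_fragment(
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters",
    "MaxThreadsPerQueue",
    64,
)
print(fragment)
```

Note that `.reg` files expect the full hive name (`HKEY_LOCAL_MACHINE`) rather than the `HKLM` abbreviation.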
However, we are still getting large recalls even when just highlighting JPEG files.
A Dtrace taken while highlighting the JPEGs shows:
4891 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:M [EvPassthruCacheInit] Queueing a data request| FileName:D:\UK\Pro\11030854 Felpham Bognor\Photos\site visit\B.N.N.R.Felpham August 2010\Picture 024.jpg| keyID : 31
4892 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:L {PassThroughRecallLimiter::PassThroughRecallLimiter} (Entry)
4893 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:L PassThroughRecallLimiter::PassThroughRecallLimiter Caller SID is S-1-5-21-1111383825-1399753330-1979989523-17310
4894 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:M WorkItem::GetExeName: Trying to get the .exe name for pid: 3796
4895 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:M WorkItem::GetExeNameUsingPHHelper: entry - PID:3796
4896 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:M WorkItem::GetExeNameUsingPHHelper: exit - PID:3796, exe name:mcshield.exe
4897 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:M WorkItem::GetExeName: The .exe name for for pid: 3796 is mcshield.exe
4898 16:36:58.841 [2872] (EvPlaceholderService) <3380> EV:L {PassThroughRecallLimiter::PassThroughRecallLimiter} (Exit) Status: [Success]
I have mcshield.exe in the excluded exe registry values for both the placeholder and pass-through entries, as we use pass-through recall.
07-09-2014 07:31 AM
yobole,
Thank you for the update on the registry change!
Did adding mcshield to the ExcludedExes key stop the recalls from occurring?
07-13-2014 05:37 PM
So in this case, CHF3 is still the latest for Enterprise Vault 10.0.4 R1?
07-13-2014 11:10 PM
Yes that's correct.
http://www.symantec.com/business/support/index?page=content&id=TECH200691