Storage Exec 5.3 Windows 2003 Cluster servers

Aurelio_Sorgiov
Level 2
Hi,

I am having an issue with one of my cluster server nodes. After about three days of running, one of the machines seems to run out of system resources, or some process is causing it to fail. Files over 1GB in size are no longer accessible (error message: access denied) and Terminal Services no longer seems to function correctly; you cannot remotely access the machine.

Each cluster node is specced with 4 x Xeon 3.2GHz CPUs, 8GB RAM, an 8GB page file, dual HBA 2GB fibre cards, and 3 x SAN-attached drives of around 950GB each.

Both nodes control around the same amount of data on each of the drives, and both have files over 1GB in size.

Does anyone have any ideas as to what may be the issue or where I should be looking?

Thanks

Aurelio
12 REPLIES

Mohammad_Ansari
Level 4
Hi Aurelio:
- Check Task Manager to see which process is using the most memory and CPU% on the affected node.

- Check whether you are running reports as part of the alarm actions of a policy on multiple objects. The alarm might be tripping at the same time on multiple objects, so multiple reports are being generated at once.

- You can also check the event log for any errors relating to Storage Exec services such as Quota Advisor or File Screen Server.

- Also, does this occur on one particular node, or on both nodes depending on which one is active?

_Shazam

Aurelio_Sorgiov
Level 2
Hi Mohammad,

The issue is happening on only one of the nodes; the other node has no problems. I have moved the shared resources from one node to the other, and the issue follows the resource shares.

Aurelio

Mohammad_Ansari
Level 4
Hi Aurelio:

Did you have a chance to look at Task Manager when you changed nodes, to see which process uses the most CPU% and memory? Also, did you check the event logs for any messages?

Another question: what kind of managed resources do you have on the resource shares (quota, file blocking)? Also, have you checked what happens if you delete the managed resources?

_Shazam

David_Clark
Level 3
Hello Aurelio

Did you find the cause of this problem?

We are seeing similar problems with w2k3 cluster resources failing after installing Storage Exec 5.3.

Many thanks

Dave

Jaap_van_Kleef
Level 3
You might want to increase the following value:

HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\IRPStacksize. If IRPStacksize does not exist, create it.

Set the value to 25 (decimal). Unfortunately, a reboot is required for this entry to take effect.

The reason is that SE introduces an additional filter driver, which needs the extra room. Even though the installer automatically increases this value from its default of 15 to 18, this is not always enough. The maximum value is 50.
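For anyone who prefers to script the change, here is a minimal sketch in Python that only builds the `reg add` command line for the value described above; it does not touch the registry itself. The function name `build_reg_command` and the choice to reject values below the default of 15 are assumptions of this sketch, not something Storage Exec prescribes. The value name is spelled `IRPStackSize` here; registry value names are not case-sensitive.

```python
# Sketch only: builds the command string, it does not modify the registry.
# The key path, the default (15) and the maximum (50) are taken from the post above.
KEY_PATH = r"HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters"
VALUE_NAME = "IRPStackSize"
DEFAULT_SIZE = 15  # default quoted in the post
MAX_SIZE = 50      # maximum quoted in the post


def build_reg_command(size: int = 25) -> str:
    """Return a `reg add` command that sets IRPStackSize to `size` (decimal).

    Hypothetical helper: the lower bound is this sketch's own choice, since
    the goal is to raise the value above its default.
    """
    if not DEFAULT_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"size must be between {DEFAULT_SIZE} and {MAX_SIZE}")
    return f'reg add "{KEY_PATH}" /v {VALUE_NAME} /t REG_DWORD /d {size} /f'


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run in an
    # administrative command prompt on the affected node.
    print(build_reg_command(25))
```

The printed command would be run on the node itself, and the reboot mentioned above still applies.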

This should resolve your issue.

Regards,

Jaap

David_Clark
Level 3
Hi Jaap

Have you actually seen this problem with 5.3, and has setting IRPStack fixed it?

Many thanks

Dave

David_Clark
Level 3
Hello Aurelio

Did you ever find the cause of this problem? I am seeing the same thing and am desperate to fix it!

Many thanks

Dave

Jose_Henrique_M
Level 2
I have the same problem and have identified the source as a handle leak in SEFSSvr.exe (FileScreen Server).

It just keeps eating handles unless you stop the service manually. Terminal Services / Remote Desktop stops working, and after a few days the server stops responding. The handle count is over 77000!

I'm using Storage Exec 5.3.2190 with hotfix 23 in a two node cluster. Clean install and I applied hotfix 23 before configuring the Windows Server 2003 cluster with SP1.

If someone has found a solution to this problem please email me.

Mohammad_Ansari
Level 4
Try disabling content checking on the folders where you have file blocking policies and see if that helps.

_Shahzad

David_Clark
Level 3
Hi Jose

Can you let me know how you identified SEFSSvr as the problem? Have you raised a call with Veritas about this? If so, what did they say?

Do you have the problem on all nodes of the cluster or just one?

We have seen the problem on one node and have had to remove Storage Exec altogether. However, on the other node it is running without problems.

BTW we do not do content checking at all.

Hopefully there will be a fix soon!!

Thanks

Jose_Henrique_M
Level 2
Hi,

I thought it might be a memory leak, so I used Performance Monitor to track the handle count and pool nonpaged bytes for all the processes on my server. After a couple of days, the only one that kept growing was SEFSSvr.exe.

The problem happens only on the active node. My file server is a two-node cluster in an active/passive configuration. On the active node the handles grow out of control; if I move to the other node, the handles on the old node freeze (they do not drop) and they start growing on the new node.
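The check described above can be sketched roughly as follows, assuming the per-process handle counts have already been exported from Performance Monitor into a dictionary of samples. The function name `steadily_growing`, the `min_growth` threshold, and the sample numbers are all hypothetical and only illustrate the idea; they are not measurements from this server.

```python
from typing import Dict, List


def steadily_growing(samples: Dict[str, List[int]], min_growth: int = 1000) -> List[str]:
    """Return the processes whose handle count never drops between samples
    and grows by at least `min_growth` from the first sample to the last."""
    suspects = []
    for name, counts in samples.items():
        if len(counts) < 2:
            continue  # a trend needs at least two samples
        never_drops = all(later >= earlier for earlier, later in zip(counts, counts[1:]))
        if never_drops and counts[-1] - counts[0] >= min_growth:
            suspects.append(name)
    return sorted(suspects)


if __name__ == "__main__":
    # Made-up handle counts, one value per sampling interval.
    example = {
        "SEFSSvr.exe": [1200, 9800, 22000, 41000],
        "lsass.exe": [900, 950, 920, 940],
        "svchost.exe": [1500, 1510, 1490, 1505],
    }
    print(steadily_growing(example))
```

A healthy process usually fluctuates around a stable level, so filtering on "never drops and keeps climbing" is a simple way to narrow a long process list down to likely leak candidates.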

I did a very basic install of Storage Exec and have not changed any Content Checking options.

I don't have paid support so I opened a free e-mail support case and got a reply the next day (see below):

Hi Jose,

Try a repair install of Storage Exec.

Thanks,
Electronic Support Team

After following their suggestion, the problem has not changed. The handles still increase (22610 at the moment). I replied to the support team but haven't heard back from them.

If anyone can help, my Case ID is 500-254-348.

Thanks,
Jose Henrique

David_Clark
Level 3
We found this problem to be caused by some other software installed on our server. Once we installed a patch to fix a memory leak in that software, Storage Exec installed and ran without problems.