MSMQ won't empty and archive task will restart

Poly
Level 3
Hi,

I have a major problem that started 6 days ago, and there is no solution in sight.

On Monday, the MSMQ storage folder reached 1 GB in size and got stuck (because of the 1 GB limit).
We had to stop the whole vault system, since we run in cluster mode, and delete all the *.mq files.

Since then the problem has continued.

Every day, the synchronization task begins.
The message queue grows to 2750 messages (the number of mailboxes) and then starts going down.
It gets to 10 messages or fewer in the queue and hangs for a few seconds.
Then it starts jumping to huge numbers and won't stop growing until the storage folder reaches 1 GB (this takes about 2 hours). When it reaches 1 GB, the system stops working and my cluster servers start failing over from one node to the other.
While MSMQ is stuck, I get multiple Archive Task events (Event ID 3197), as if the synchronization task were starting over again and again.
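
For anyone who wants to watch the storage folder approach the limit rather than find out at failover time, here is a minimal sketch. It assumes the queue data lives in *.mq files directly inside one folder and that the limit is exactly 1 GB, as described above; the function names, the 80% warning threshold, and the folder path you pass in are illustrative, not part of MSMQ or Enterprise Vault.

```python
from pathlib import Path

# 1 GB limit as described in the post; adjust if your storage quota differs.
LIMIT_BYTES = 1024 ** 3


def mq_storage_usage(storage_dir):
    """Return the total size in bytes of the *.mq files directly in storage_dir."""
    return sum(
        p.stat().st_size
        for p in Path(storage_dir).glob("*.mq")
        if p.is_file()
    )


def usage_report(storage_dir, limit=LIMIT_BYTES, warn_ratio=0.8):
    """Return (used_bytes, used/limit, warning_flag) for the storage folder.

    warn_ratio is a hypothetical threshold chosen for illustration.
    """
    used = mq_storage_usage(storage_dir)
    ratio = used / limit
    return used, ratio, ratio >= warn_ratio
```

Calling `usage_report(...)` with the path of your MSMQ storage folder from a scheduled job and alerting when the flag is true would at least give some warning before the queue fills up.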

We can't find the cause.
It's been a week now, and Symantec can't help me; so far they say this is the first time they have seen this problem.
For a week I have had to take the system down every single day.

Maybe you've heard of such a case?

Poly.
12 REPLIES

Michael_Bilsbor
Level 6
Accredited
Hi,
 
What's your case number?
 
Mike (Symantec)

Poly
Level 3
My case number is 240-744-828

Allan
Level 3
Partner
Is there a solution to this issue? I am going onsite tomorrow morning to our biggest client and they sound like they have a similar issue.

Allan
Level 3
Partner
I will post my updates or findings here as well.

jimbo2
Level 6
Partner
Verify that the MSMQ is excluded from virus scan. Verify that it is installed without AD integration.

Michael_Bilsbor
Level 6
Accredited
Hi Poly,
 
You should be receiving a diagnostic module tomorrow. My team has developed it and sent it to support to pass on to you.
 
 

Allan
Level 3
Partner
Hi there, any chance of me getting a copy of that tool as well? Our client has 40000 mailboxes and journals in excess of 600000 messages per day...

Michael_Bilsbor
Level 6
Accredited
Hi,
 
Remember, this is just a diagnostic, not a fix. Once we have applied it to Poly's system, hopefully we'll know more about what the issue is and can go from there.
I don't want to send the diagnostic to multiple customers, and there is no need to at present. I don't even know if this is the same issue. What's your case reference number?
 
 

Poly
Level 3
Hi,

Can you please hurry them up?

They told me about the debug module 3 days ago and still nothing.
I got one call yesterday telling me to keep waiting.

Poly

Allan
Level 3
Partner
We're running Enterprise Vault 2007 SP1 on Windows Server 2003 SP1. Our server is under heavy load with huge amounts of email to journal, and we've discovered that the EVConvertorSandbox.exe tasks don't appear to release their CPU time when the default 10-minute timeout occurs.
 
What this means is that a process gets stuck on one email that is either corrupt or just plain too large (attachment) and keeps running on that single email. In my case, often 3 out of the 5 processes are bogged down on single emails, leaving only 2 to do the work. That isn't enough, hence the massive backlog of emails in the MSMQ J2 queue and the journaling mailbox.
 
The fix is to set the timeout for the registry values ConversionTimeout and ConversionTimeoutArchiveFiles to something more reasonable, like 5 minutes. The problem is that in version 2007 SP1 these values now reside under HKLM\Software\KVS\EnterpriseVault and are of type REG_SZ, whereas they used to be located under HKLM\Software\KVS\EnterpriseVault\Storage and were always of type DWORD.
 
I've tried creating them as-is in the HKLM\Software\KVS\EnterpriseVault\Storage location, but that doesn't seem to work. I've now requested that this be escalated to Engineering. If I get a solution, I will post it here.
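
In case it helps anyone compare servers, here is a rough read-only sketch that reports where the two timeout values currently exist and what type they have. It uses Python's standard `winreg` module (Windows only); the key paths and value names are the ones quoted above, while the helper names are made up for illustration. It does not change anything, and it does not take 32-bit/64-bit registry redirection into account.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Windows registry type codes for the two types discussed above.
TYPE_NAMES = {1: "REG_SZ", 4: "REG_DWORD"}

# Key paths (under HKLM) and value names as quoted in the post.
KEYS = [
    r"Software\KVS\EnterpriseVault",
    r"Software\KVS\EnterpriseVault\Storage",
]
VALUES = ["ConversionTimeout", "ConversionTimeoutArchiveFiles"]


def type_name(code):
    """Translate a registry type code into a readable name."""
    return TYPE_NAMES.get(code, f"type {code}")


def find_timeouts():
    """Return (key path, value name, data, type name) for each value found."""
    found = []
    if winreg is None:  # not running on Windows
        return found
    for key_path in KEYS:
        try:
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
                for name in VALUES:
                    try:
                        data, code = winreg.QueryValueEx(key, name)
                    except FileNotFoundError:
                        continue  # value not present under this key
                    found.append((key_path, name, data, type_name(code)))
        except OSError:
            continue  # key missing or not readable
    return found
```

Printing each tuple from `find_timeouts()` on the Enterprise Vault server shows at a glance whether the values sit under the old or the new location and whether they are strings or DWORDs.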

Poly
Level 3
EV version 7.5 + SP3.

An update to whom it may concern.

It appears to be a problem between EV and the DST patch that is installed on the servers.
About a year ago, Israel installed a different patch, released by Microsoft because of changes to the DST dates.

After a lot of research and many Dtrace sessions, it appears that EV cannot do some table calculations when processing the ArchiveTask while the server is set to the Jerusalem (+2) time zone with DST.

For now we are running on the Helsinki time zone, which is also +2 with DST.
That was the problem we had, and this is the workaround we used at Symantec's suggestion.
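
One caveat with this workaround, sketched below: the two zones share the same base offset but do not necessarily switch DST on the same dates, so local timestamps can differ by an hour on some days. The sketch uses Python's `zoneinfo` with the IANA zones `Asia/Jerusalem` and `Europe/Helsinki` as stand-ins for the Windows "Jerusalem" and "Helsinki" settings; that mapping is an assumption, and the IANA data is not the same thing as the Windows DST patch data. On Windows, `zoneinfo` also needs the `tzdata` package installed.

```python
from datetime import date, datetime, timedelta
from zoneinfo import ZoneInfo

# IANA zones used as stand-ins for the Windows time zone settings (assumption).
JERUSALEM = ZoneInfo("Asia/Jerusalem")
HELSINKI = ZoneInfo("Europe/Helsinki")


def offset_hours(zone, day):
    """UTC offset in hours at local noon on the given day."""
    noon = datetime(day.year, day.month, day.day, 12, tzinfo=zone)
    return noon.utcoffset().total_seconds() / 3600


def days_with_different_offset(year):
    """List the days in a year on which the two zones disagree at local noon."""
    day, differing = date(year, 1, 1), []
    while day.year == year:
        if offset_hours(JERUSALEM, day) != offset_hours(HELSINKI, day):
            differing.append(day)
        day += timedelta(days=1)
    return differing
```

Running `days_with_different_offset(...)` for the year in question lists the dates on which the server's local time would be an hour off from real Jerusalem time while it is set to Helsinki.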

I must say that I'm deeply dissatisfied: the problem does not exist in earlier versions of EV, which probably means there were some changes to the EV code in version 7.5.

For now, Symantec will not fix this and is waiting for Microsoft to fix it.

Poly.

Mojorsn
Level 5

Does anyone have an update on this issue? We are experiencing the same issue as Allan and were wondering if Symantec has addressed it. Thanks.

 

Tom