Thanks for your reply. We

xcraft · ‎05-09-2014

Dear All,

The below is our environment information:

EV Version: 10.0.4 with hotfix to support Exchange 2013 SP1

Exchange Server Version: 2013 CU2

We have migrated our exchange server from 2010 to 2013 successfully. However, the archiving tasks is suspend when performing the email archive.

I noticed that it will say Not enough storage but we have enough storage for this operations. Also, we have perform the DB maintenace plan already. We have no clue whats going wrong.

Windows Event Logs

Log Name: Symantec Enterprise Vault
Source: Enterprise Vault
Date: 5/9/2014 12:08:44 AM
Event ID: 2216
Task Category: Archive Task
Level: Information
Keywords: Classic
User: N/A
Computer: EVSERVER01
Description:
Message dispenser will suspend processing for 10 minutes due to a recoverable error

Task: Exchange Mailbox Archiving Task for EXCHANGE02

The below is the dtrace logs

22525   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:M   {CArchivingAgent::PostProcessArchivedItemEx:#8355} Error to be retried [0x8007000e], Retry 0 of 3
22526   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:L   {CArchivingAgent::IncrementOpsLeadingToRecycle:#23900} Current count [11], recycle at [50000]
22527   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:M   :CArchivingAgent::PostProcessArchivedItem() |Exiting routine |
22528   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:H   {CArchivingAgent::PostProcessArchivedItemEx:#8458} It took [0.669503] secs to post-process Saveset [201405089561548~201404111632430000~Z~A01734A53BB65E8F5B3B121AB23B90C1]
22529   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:H   {CArchivingAgent::PostProcessArchivedItemEx} (Exit) Status: [Not enough storage is available to complete this operation. (0x8007000e)]
22530   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:L   {AgentMessageDispenser::ActivateObject} (Exit) Status: [A recoverable error was detected, the dispenser has been asked to sleep (0xc0040cd4)]
22531   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:M   {AgentMessageDispenser::ProcessNextMessage:#1032} It took [0.676451] seconds to process the [MsgID_RemoveBody (0)] MSMQ message body (ActivateObject). Processing the message [failed].
22532   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:M   {AgentMessageDispenser::ProcessNextMessage:#1238} AGENTS_E_DISPSLEEPNOREPOST. Agent will sleep without reposting the current message.
22533   23:06:06.393   [4464]   (ArchiveTask)   <4128>   EV:L   {AgentMessageDispenser::ProcessNextMessage:#1521} Committing MSMQ transaction.
22534   23:06:06.424   [4464]   (ArchiveTask)   <4128>   EV:M   AgentMessageDispenser: 1 of 5 Agent threads is pausing
22535   23:06:06.424   [4464]   (ArchiveTask)   <4128>   EV:M   AgentMessageDispenser::LogPauseReason - First Dispenser Thread to Sleep
22536   23:06:06.424   [4464]   (ArchiveTask)   <4128>   EV~I   Event ID: 2216 Message dispenser will suspend processing for 10 minutes due to a recoverable error |Task: Exchange Mailbox Archiving Task for EXCHANGE02 |
22539   23:06:12.212   [4464]   (ArchiveTask)   <2960>   EV:L

Thanks,

EV Users

Merv · ‎05-09-2014

This is normally die to mapi profile exhaustion...you can confirm this by checking the number of outlook profiles under the VSA account. Stop the task controller service and delete all profiles. Create a test one for VSA and system mbx. If good start the task controller svc and errors should cease.

If they come back then we have a perf issue and you can tune the EV server down a little and automate the profile clean up with profile expire reg key and reducing connections to exchange

http://www.symantec.com/business/support/index?page=content&id=TECH129173

how many exchange servers do you have setup in the EV server? How often do you archive? How many concurrent threads per task? Do you do scheduled pst migrations? Public folder task? Lots of restores? Manual archiving from users?Current state of msmq? Any backlog there?

xcraft · ‎05-10-2014

Hello Merv,

Thanks for your reply. Actually, we did try to delete all of the outlook profiles already. Also, we stop all scheduled tasks and run the manual archive for one mailbox only to test and the error will be showing in the event logs after all.

The concurrent threads pre task is using the default value (5 and 1000), we did try to modfy the value to 3 and 500 but it is the same.

When we perform the manual archive, I saw it can archive the emails very fasy but after few minutes, that event 2216 will be appeared and the archive tasks will be suspended for 10 minutes to slow down the whole archiving process.

We have 4 exchange 2013 servers in our environments.

Please see if you have any idea we can troubleshoot the issues.

Thanks,

EV Users

Merv · ‎05-10-2014

Ok so we know the problem..profiles hit the limit after a few minutes.

Some logs and info required

- are you using outlook 2007 sp3 with july2012 updates KB2596598?

-get me a count of the profile number in the regkey

HKCurrentUsers\Software\Microsoft\Windows NT\Current Version\Windows Messaging Subsystem\Profiles\{user id}

how many are there in the VSA profile? Or check outlook profiles and do some screenshots

-start a dtrace of 1. archivetask 2. agentclientbroker 3. storagearchive and 4. EVExchangeWebServicesProxy

http://www.symantec.com/business/support/index?page=content&id=TECH38122

- clear all profiles, stop task controller before that, start dtrace and logging verbose mode for those 3 processes then wait till you get the 2216 errors then stop the trace

- get me a screenshot of all your msmq queues before the errors and after the errors

- you mentioned sql..any large tables? Watchfile and journalarchive ? Index fragmentation issues or performance issues? ( this should be related but could affect if there is generally slow archiving and you see items stuck in storage archive queue for too long)

- any other related event Id errors or warnings?

some common fixes to try:

Improve GC look ups by hard coding CAS array fqdn (e.g. Mail.domain.com) on the DS Server regkey

http://www.symantec.com/business/support/index?page=content&id=TECH139751

In case you get any event Id 3460 but would not hurt to configure this for exchange 2013 or remove this if it causes an issue

http://www.symantec.com/business/support/index?page=content&id=TECH213773

load balancer issues

http://www.symantec.com/business/support/index?page=content&id=TECH169970

Further tune it down - lower your threads down to 2 per task and monitor

Merv · ‎05-10-2014

Ok so we know the problem..profiles hit the limit after a few minutes. I.e. You don't get the error after restarting task controller.

Some logs and further info required

- Are you using outlook 2007 sp3 with july2012 updates KB2596598?

-Get me a count of the total number of outlook profiles? I.e from the registry:

HKCurrentUsers\Software\Microsoft\Windows NT\Current Version\Windows Messaging Subsystem\Profiles\{user id}

Or check outlook profiles and do some screenshots

-Start a dtrace of 1. archivetask 2. agentclientbroker 3. storagearchive and 4. EVExchangeWebServicesProxy

http://www.symantec.com/business/support/index?page=content&id=TECH38122

- clear all profiles, stop task controller before that, start dtrace and logging verbose mode for those 3 processes then wait till you get the 2216 errors then stop the trace

- Get me a screenshot of all your msmq queues before the errors and after the errors

- you mentioned sql..any large tables? Watchfile and journalarchive ? Index fragmentation issues or performance issues? ( this should be unrelated but could affect if there is generally slow archiving and you see items stuck in storage archive queue for too long)

- any other related event Id errors or warnings? Before you get the 2216 errors

Merv · ‎05-10-2014

Some common fixes to try:

Improve GC look ups by hard coding CAS array fqdn (e.g. Mail.domain. com) on the DS Server regkey

http://www.symantec.com/business/support/index?page=content&id=TECH139751

In case you get any event Id 3460 but would not hurt to configure this for exchange 2013 or remove this if it causes an issue

http://www.symantec.com/business/support/index?page=content&id=TECH213773

breaking my reply as my comments are being blocked automatically "Your comment has been queued for moderation by site administrators and will be published after approval. "

Merv · ‎05-10-2014

Last point

Load balancer issues

http://www.symantec.com/business/support/index?page=content&id=TECH169970

Further tune it down - lower your threads down to 2 per task and monitor if this helps and do daily restarts of task controller service

xcraft · ‎05-11-2014

Thanks for your reply. We have applied the latest security update to the outlook 2007 already which will support exchange 2013. (SP3 + latest update which covers the KB2596598)

for the profiles, there have around 70 profiles are there and there have many items didnt get clear in the message queue because of the tasks will be suspend 10 minutes always.

for the SQL, we did run the SQL maintenace according to the symantec KB (shrink+reindex the DB before run the archive) already but it didnt resolve the problem. Any things we can perform in SQL level?

We are using Exchange 2013 and there should have no CAS array anymore. For the GC, all of our domain controllers are in the same subnet and same location, it should not need to set it in the registry, right?

For the message queue, before we archive the items, A1 queue has 135, after tue the threads from 5 to 2, A1 queue has 99

In the meantime, we did by pass the load balancer from EV server already (use host file to point the outlook anywhere address to exchange server directly.

I did see some errors in the event logs in sometimes (not always happen)

EVENTID 3230

Could not create a new MAPI session on Exchange server 'EXCHANGE02'.
The most likely cause is a connection issue that is causing excessive delays in the task. This can often be recovered by restarting the MAPI related tasks.

Internal References:
Unable to get exclusive access to the MAPI thread pool.
Mutex name: Lie mode lock
Reason: Last lock holder: Process <12196> <RetrievalTask> Thread <6288>

EVENTID 7206

A locking error has occurred in CEVMAPILieModeLock : 0x102
Internal reference: STWL Thread Mutex CMAPISession::CreateMapiSession

After i cleaned up all outlook profiles + tune down the thread from 5 to 2 with items from 1000 to 400, event ID 2216 is still happening...

14099   22:27:19.849   [10860]   (ArchiveTask)   <9464>   EV:L   {DelAttachFromMsgAndAppendList:#2712} Reading PR_ATTACH_DATA_BIN. Checking for dummy attachments.
14100   22:27:19.849   [10860]   (ArchiveTask)   <9464>   EV:L   {DelAttachFromMsgAndAppendList:#2792} Item isn't a document, so we want to build up an attachment list.
14101   22:27:19.849   [10860]   (ArchiveTask)   <9464>   EV:L   {DelAttachFromMsgAndAppendList:#2861} Building both a HTML and RTF copy of the attachment list
14102   22:27:19.849   [10860]   (ArchiveTask)   <9464>   EV:L   {DelAttachFromMsgAndAppendList:#2952} Constructing links for non-calender item attachments
14103   22:27:20.083   [10860]   (ArchiveTask)   <9464>   EV:L   {RemoveITPureRTFImg:#1606} RTF image not formatted as expected. It does not have |in control, so will do no further processing.
14104   22:27:20.114   [10860]   (ArchiveTask)   <9464>   EV:L   {DelAttachFromMsgAndAppendList} (Exit) Status: [Success]
14105   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:M   {CArchivingAgent::PostProcessArchivedItemEx:#8355} Error to be retried [0x8007000e], Retry 0 of 3
14106   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:L   {CArchivingAgent::IncrementOpsLeadingToRecycle:#23900} Current count [36], recycle at [50000]
14107   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:M   :CArchivingAgent::PostProcessArchivedItem() |Exiting routine |
14108   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:H   {CArchivingAgent::PostProcessArchivedItemEx:#8458} It took [9.641376] secs to post-process Saveset [201405119816084~201405080206020000~Z~904849CD39BABFEC0BED2C2B052F6C11]
14109   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:H   {CArchivingAgent::PostProcessArchivedItemEx} (Exit) Status: [Not enough storage is available to complete this operation. (0x8007000e)]
14110   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:L   {AgentMessageDispenser::ActivateObject} (Exit) Status: [A recoverable error was detected, the dispenser has been asked to sleep (0xc0040cd4)]
14111   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:M   {AgentMessageDispenser::ProcessNextMessage:#1032} It took [9.641730] seconds to process the [MsgID_RemoveBody (0)] MSMQ message body (ActivateObject). Processing the message [failed].
14112   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:M   {AgentMessageDispenser::ProcessNextMessage:#1238} AGENTS_E_DISPSLEEPNOREPOST. Agent will sleep without reposting the current message.
14113   22:27:20.146   [10860]   (ArchiveTask)   <9464>   EV:L   {AgentMessageDispenser::ProcessNextMessage:#1521} Committing MSMQ transaction.
14114   22:27:20.177   [10860]   (ArchiveTask)   <9464>   EV:M   AgentMessageDispenser: 1 of 2 Agent threads is pausing
14115   22:27:20.177   [10860]   (ArchiveTask)   <9464>   EV:M   AgentMessageDispenser::LogPauseReason - First Dispenser Thread to Sleep
14116   22:27:20.177   [10860]   (ArchiveTask)   <9464>   EV~I   Event ID: 2216 Message dispenser will suspend processing for 10 minutes due to a recoverable error |Task: Exchange Mailbox Archiving Task for EXCHANGE02 |

Thanks,

EV User

Merv · ‎05-11-2014

Forgive the duplicate post... The lie lock errors are interesting..could they be the root cause of your MAPI exhaustion issues? Or 2 seperate issues?Lie lock mutex issues: http://www.symantec.com/docs/TECH216602 Check if you have the same set of event id's as the above by filtering through your event viewer and dtrace logs Try to stop all your task and only start one and see if you get the errors .. I would get a case opened with support to confirm if you face the lie lock issues and the upgrade to EV 11(Just released last mon)

xcraft · ‎05-15-2014

Thanks for your help. I did test with ony tasks with manual archive few items but it didnt show those errors anymore...however, it will have event id 2216 which is still showing in the event logs...it means that the whole archiving process will be slowed down by this event id....

VOX

Message dispenser will suspend processing for 10 minutes due to a recoverable error