Forum Discussion

LostInDocumenta
11 years ago

Journal Archiving freezes, DTrace Error 0xc0040b31

EV: 10.0.4.1189

Exchange 2010: 14.03.0158.001

Windows Server 2008 R2 Enterprise SP1

I am being alerted to the issue by the mailbox database size threshold set for the journal. No errors are reported for the journal task itself, but items are clearly not being fully archived. Viewing the journal mailbox through Outlook shows about 92,000 items in Pending Archive Part and 6,800 in Pending Archive, and unarchived items continue to grow at a steady rate. Restarting the journaling task gets some items moving for a few minutes, until archiving seems to just stop. (After restarting the task, the number of items with the message class goes down while Pending Archive and Pending Archive Part rise and fall; then, after a few minutes, Pending Archive and Pending Archive Part stop moving and message items increase again.) After restarting the storage service, it finally looks like it is going in the right direction. This has happened about 3 times in the last month, and each time I have had to restart the storage service to get the journaling task to run normally.

There were no obvious events in the event log relating to this issue on the EV server.

DTrace of the Journal Task shows the following:

<JournalTask> <6656> EV:H {CArchivingAgent::QueueEligibleItem} HRXEX fn trace : Error [0xc0040b31], [.\ArchivingAgent.cpp, lines {25004,25034,25054,25062}, built Jul 10 17:51:28 2013].
<JournalTask> <10364> EV:H {CArchivingAgent::QueueEligibleItem} HRXEX fn trace : Error [0xc0040b31], [.\JournalPart.cpp, lines {413,477,488,509,521,1020}, built Jul 10 17:51:28 2013].

and 

<JournalTask> <18816> EV:M {CStorageArchive::GetVaultStatusEx:#764} Modifying throttle status for archive [####################VAULT] from [DV_DS_AS_AVAILABLE (1)] to [DV_DS_AS_STORAGE_QUEUES_FULL (6)].
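For anyone who wants to tally how often these two kinds of lines appear in a long trace, here is a rough Python sketch. It is not an Enterprise Vault tool; the regular expressions are guesses based only on the sample lines quoted above, and the names `summarize_dtrace` and `SAMPLE` are made up for illustration. Other DTrace line layouts will simply be ignored.

```python
import re
from collections import Counter

# Assumption: patterns are inferred solely from the lines quoted in this post.
ERROR_RE = re.compile(
    r"\{(?P<func>[^}]+)\}.*?Error \[(?P<code>0x[0-9a-fA-F]+)\]"
)
THROTTLE_RE = re.compile(
    r"Modifying throttle status for archive \[(?P<archive>[^\]]*)\] "
    r"from \[(?P<old>\w+) \(\d+\)\] to \[(?P<new>\w+) \(\d+\)\]"
)


def summarize_dtrace(lines):
    """Count error codes per function and throttle-status transitions."""
    errors = Counter()
    throttles = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            key = (match.group("func"), match.group("code").lower())
            errors[key] += 1
            continue
        match = THROTTLE_RE.search(line)
        if match:
            throttles[(match.group("old"), match.group("new"))] += 1
    return errors, throttles


# Hypothetical sample input, modeled on the lines quoted in this post.
SAMPLE = [
    r"<JournalTask> <6656> EV:H {CArchivingAgent::QueueEligibleItem} "
    r"HRXEX fn trace : Error [0xc0040b31], [.\ArchivingAgent.cpp]",
    r"<JournalTask> <10364> EV:H {CArchivingAgent::QueueEligibleItem} "
    r"HRXEX fn trace : Error [0xc0040b31], [.\JournalPart.cpp]",
    "<JournalTask> <18816> EV:M {GetVaultStatusEx} Modifying throttle "
    "status for archive [VAULT] from [DV_DS_AS_AVAILABLE (1)] "
    "to [DV_DS_AS_STORAGE_QUEUES_FULL (6)].",
]

if __name__ == "__main__":
    err_counts, throttle_counts = summarize_dtrace(SAMPLE)
    for (func, code), count in err_counts.items():
        print(f"{code} in {func}: {count}")
    for (old, new), count in throttle_counts.items():
        print(f"{old} -> {new}: {count}")
```

To run it against a real trace, pass an open file object instead of `SAMPLE`; any iterable of text lines works.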

 

For the first error I found this thread: http://www.symantec.com/connect/forums/archivingjournal-task-will-pause-let-it-recover-0xc0040b31, but the links to support materials in it are no longer valid.

For the second message, I found an article http://www.symantec.com/business/support//index?page=content&pmv=print&impressions=&viewlocale=&id=TECH173570 

 

Has anyone else experienced a similar situation? The best advice I have gotten so far is to bounce the services every once in a while.

 

 

 

9 Replies

  • Have you tried this yet? http://www.symantec.com/business/support/index?page=content&id=HOWTO58188

  • 1. What's the size of your MSMQ storage?

    2. Have you set email profiles to be removed after 1 day?

     

  • I looked at this article but haven't implemented it yet. I try not to create registry keys if I don't absolutely have to. I was hoping to get a feel for whether this is something other people are seeing in their environments. It just gets a little annoying having to restart the storage service, especially since it also affects the actual users.

  • You have to keep in mind that each environment is different, and that EV as an application has dozens of moving parts and also taps into dozens of moving parts in your environment in order to operate. If there's something I've learned over the years, it's that if you're having an issue and it's described in a technote, it's worth trying the reg key. There's no harm in it (in this case) and it's very easy to back out if it doesn't help.

  • I have seen in some cases that the counter method did need to change in order to get the proper information back from MSMQ. This allowed the process to perform normally.

    If it is only a storage issue then that setting is worth a test. If the mailbox is backing up due to mass mailers using the same ID then you would want to take a look at the following: http://www.symantec.com/docs/TECH181078

     

  • Looks to me like a problem with the storage archive queue hitting a limit. Are any messages stuck in the storage archive queue before you restart it? And/or A4? If items are stuck in there, the common culprit is writing either to the journal vault store partition (i.e. your SAN/NAS/storage) or to your SQL server (i.e. your journal vault store DB). More than likely it is SQL: any long-running SPIDs, blocks or, worse, deadlocks? Get your SQL DBA to monitor the activity when you hit the problem. What is the size of your vault store DB? How many rows are in the journal archive table and the watchfile table? Your research pointed to SQL maintenance as a solution: things like updating statistics, but specifically index fragmentation. Do you have scheduled SQL maintenance in place for your EV DBs? Has it been running successfully? If not, there is no harm in checking for index fragmentation and running a one-off maintenance. More info: http://www.symantec.com/business/support/index?page=content&id=TECH168905

  • Maintenance is run every other day and has been completing successfully. The database is rather large, we also run DA which keeps a good amount of items for an extended period.

    This is for the journal, so it tends to be the J3 queue that gets stuck, and there tend to be one or two items in there when the task seems to stall.

    Will definitely run the check just to be sure.

  • The email gateway is pretty good at picking up and blocking mass emailers, though some do get through from time to time. I didn't see anything excessive (a few instances of 9 or 10 of the same email, but that looks pretty normal for a distribution group email).

  • Thanks, will mark your original post as the answer; it seems to be the best fit for the information presented. You are absolutely right about lots of moving parts: new day, new error. I will play with this setting over the next few days. There may be some other underlying issue here, and I will have to do some more digging to get to the root of it.

    Today, something is inexplicably putting the stores into backup mode. Index and store volumes are all well above 20% free and set to roll over at 5%. ....

    I will keep working on my DTraces and see what fun errors I find.

     

    Thanks to all for your help with this, it is truly appreciated.