Exchange log space ran out... fixed, but now vault won't archive journal mailbox

Matthew_J
Level 4

So, over the weekend our Exchange log space ran out. We are running Exchange 2013 with 3 DAG members. After fixing the space and recovering Exchange services, mail is flowing again. However, Vault is no longer processing mail out of the journal mailbox as expected. I have lots of these in the verbose EV event log:

A queued operation exceeded the retry count and has been discarded

Exchange Journaling Task for STORK1

Retrieved a [MsgID_ArchiveMailboxEx3 (91)] MSMQ message on queue [JET\Private$\Enterprise Vault Exchange Journaling Task for STORK1 170124131644 J3], but it's retry count [4] exceeds the maximum retry limit [3].
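For readers unfamiliar with this event, here is a minimal Python sketch of the retry-then-discard pattern the message describes: a failed message is reposted to the end of the queue with its retry count incremented, and it is dropped once that count exceeds the limit. This is only an illustration of the behaviour reported in the event text and the DTRACE, not Enterprise Vault's actual code. The names `drain`, `handler` and `always_fail` are made up for the example.

```python
from collections import deque

MAX_RETRIES = 3  # the limit reported in the event text above


def drain(queue, handler, max_retries=MAX_RETRIES):
    """Process (body, retry_count) messages until the queue is empty.

    Returns the list of messages that were discarded because their
    retry count exceeded max_retries.
    """
    discarded = []
    while queue:
        body, retries = queue.popleft()
        if retries > max_retries:
            # Corresponds to "exceeded the retry count and has been discarded".
            discarded.append((body, retries))
            continue
        try:
            handler(body)
        except Exception:
            # Unrecognised error: bump the retry count and repost at the end.
            queue.append((body, retries + 1))
    return discarded


def always_fail(body):
    # Hypothetical handler standing in for a lookup that never succeeds.
    raise LookupError("Element not found")


if __name__ == "__main__":
    q = deque([("journal-mailbox", 0)])
    print(drain(q, always_fail))
```

With a handler that always fails, the message is attempted four times (retry counts 0 to 3) and is discarded when it comes back with a count of 4, which lines up with "retry count [4] exceeds the maximum retry limit [3]".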

 

I ran a DTRACE against the journal archiving task and this section stands out to me. 

{CArchivingAgent::ProcessUserEx:#18838} Processing mbx [/O=CHELAN COUNTY POWER/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=492C1005BB764F748799D56D22F21582-EXCH], mode [RN_ARCHIVING (0x1)], report mode [False].
{GetMailboxEntryDetails:#4847} Entry
CBaseDirectoryServiceWrapper::CreateDirectoryService() - Entry [m_nNumTries = 40]
{AgentMessageDispenser::ProcessNextMessage:#1521} Committing MSMQ transaction.
CBaseDirectoryServiceWrapper::CreateDirectoryService() - Successfully communicated with an EV Directory Service on the local machine
{VaultCoCreateInstanceEx} CLSID [{4EC6FF76-C97A-11D1-90E0-0000F879BE6A}] Server Name [(null)] Used Server Name [(null)] Num of attempts [1] Total elapsed [0.000s] Result [Success  (0)]
{GetMailboxEntryDetails:#4864} No matching MailboxEntry found
{GetMailboxEntryDetails} HRXEX fn trace : Error [0x80070490], [.\AgentsFunctions.cpp, lines {4845,4854,4858,4860,4865}, built Apr  3 22:28:46 2014].
{GetMailboxEntryDetails:#4922} Exception Occurred [0x80070490] [Element not found.]
{CArchivingAgent::ProcessUserEx:#18865} The Distinguished name for the mailbox is [CN=Exch13JournalMailbox,OU=SrvcAccts,OU=User Accounts,DC=domain1,DC=chelan].
{IsSystemMbx:#3620} The distinguished name of the mailbox (ADMbxDN) = [CN=Exch13JournalMailbox,OU=SrvcAccts,OU=User Accounts,DC=domain1,DC=chelan]
{CMailboxUsage::SetMailboxInUse:#196} Added [/O=CHELAN COUNTY POWER/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=492C1005BB764F748799D56D22F21582-EXCH] to list of mailboxes to be processed. List now contains [1] mailboxes.
{CArchivingAgent::Initialise} (Entry)
{MigratedDominoItems::Reset} (Entry)
{MigratedDominoItems::Reset} (Exit)
{CPrioritizedItemTable::InitialiseTable:#108} (Age) - Setting up the table.  Size: [1000]
{CPrioritizedItemTable::InitialiseTable:#108} (Quota) - Setting up the table.  Size: [1000]
{CArchivingAgent::Initialise} (Exit) Status: [Success]
:CArchivingAgent::ProcessUser() |Exiting routine |
{CArchivingAgent::ProcessUserEx:#20214} It took [0.010189] seconds to process mailbox [/O=CHELAN COUNTY POWER/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=492C1005BB764F748799D56D22F21582-EXCH]. [0x80070490]
{CMailboxUsage::RemoveUserFromList:#113} Removed [/O=CHELAN COUNTY POWER/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=492C1005BB764F748799D56D22F21582-EXCH] from list of mailboxes being processed. List now contains [0] mailboxes.
{AgentMessageDispenser::ActivateObject:#3708} An error that we don't specifically recognise has occurred, [0x80070490], so we'll just increment this messages retry count.
{AgentMessageDispenser::ActivateObject} (Exit) Status: [The dispenser was asked to repost the message to be retried later      (0xc00408e2)]
{AgentMessageDispenser::ProcessNextMessage:#1032} It took [0.010831] seconds to process the [MsgID_ArchiveMailboxEx3 (91)] MSMQ message body (ActivateObject). Processing the message [failed].
{AgentMessageDispenser::ProcessNextMessage:#1048} Processing MSMQ message body (ActivateObject) failed with [0xc00408e2]. The retry count will be incremented, and the message reposted to the end of the queue.
{AgentMessageDispenser::ProcessNextMessage:#1112} Reposting a [MsgID_ArchiveMailboxEx3 (91)] MSMQ message on queue [JET\Private$\Enterprise Vault Exchange Journaling Task for STORK1 170124131644 J3]. The new retry count is [1]
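As a side note on reading the trace: the error 0x80070490 is a standard HRESULT that wraps a Win32 error code. A minimal Python sketch for pulling bracketed HRESULTs out of DTRACE lines and splitting them into their fields is below. The function names and the regular expression are my own for illustration, and the lookup table only contains the one code seen in this trace.

```python
import re

FACILITY_WIN32 = 7
# Only the code seen in this trace; 1168 is ERROR_NOT_FOUND in winerror.h.
KNOWN_WIN32_CODES = {1168: "ERROR_NOT_FOUND (Element not found.)"}

# Matches values such as [0x80070490]; assumes 8 hex digits inside brackets.
HRESULT_RE = re.compile(r"\[0x([0-9A-Fa-f]{8})\]")


def decode_hresult(value):
    """Split an HRESULT into (failed, facility, code)."""
    failed = bool(value & 0x80000000)   # top bit set means failure
    facility = (value >> 16) & 0x7FF    # 11-bit facility field
    code = value & 0xFFFF               # low 16 bits
    return failed, facility, code


def hresults_in(line):
    """Return every bracketed 8-digit hex value found in a trace line."""
    return [int(m.group(1), 16) for m in HRESULT_RE.finditer(line)]


if __name__ == "__main__":
    line = "{GetMailboxEntryDetails:#4922} Exception Occurred [0x80070490] [Element not found.]"
    for hr in hresults_in(line):
        failed, facility, code = decode_hresult(hr)
        if facility == FACILITY_WIN32:
            print(hex(hr), code, KNOWN_WIN32_CODES.get(code, "unknown"))
```

Here 0x80070490 decodes to a failure in the Win32 facility with code 1168, which is the same "Element not found" text the trace prints after `GetMailboxEntryDetails` reports "No matching MailboxEntry found".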

This is Vault 10.0.4 (I know, it's out of support and needs to be upgraded.... It's been on my list for a while but keeps getting priority bumped) and Exchange 2013.

 

Can anyone help?

 

6 REPLIES

plaudone1
Level 6
Employee

Hi Matthew_J,

It could be that there are some corrupted items in the mailbox. See https://www.veritas.com/support/en_US/article.000024128

Regards,

Patrick 

So, as an update here.... Since signs were pointing to message or mailbox corruption of some sort in our journal mailbox, we decided to decommission the existing journal mailbox and create a new one. The existing mailbox had grown to 10,000 items since this morning, so we figured we'd be OK with just getting archiving running again and dealing with the backlog later. We did that, and Exchange is journaling to the new journal mailbox. We removed the old journal mailbox target and related tasks and added new ones for the new mailbox. DTRACE tells me the task is pointing at the new mailbox, so that seems to be successful.

However, we are still not able to get items to archive out of even the new journal mailbox. A DTRACE of the journal archiving task shows the same errors as above. After restarting the admin service, all the other services start successfully according to Event Viewer, then we get four 2270 events in a row, and nothing gets archived.

Where else can I look to figure out how to get the journal archiving running again?  Anyone have any other thoughts here?

Veritas support shut us down initially because we are still on 10.0.4 CHF3, and now we are exploring how to get extended support either through our reseller or directly with a credit card.  I'm hoping that this will give us the ammo we need to make getting upgraded to at least 11 a priority.

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hello Matthew.

Is this an 'archive journal mailbox' only server, or do you also do mailbox archiving? I assume only journal archiving.

Can you sync the new Journal Mailbox without issues? How many items are in that new Journal Mailbox? Do you have items going from normal to 'pending archiving'? What is the size of your MSMQ? If necessary, can you enlarge that in Server Manager?

What is your Journal Archiving policy? If you have pending items, do you have 'clear pending when running in report mode' in advanced settings? If you do, set the task to report mode and restart it. Verify that the items in the Journal Mailbox start changing back to IPM.Note (this might take a while, by the way). When done, check MSMQ. If it is empty, there are no issues: change the task to run in normal mode again and restart it. If there are items in there, verify you have no issues on the SQL, storage and index locations (disk space?). Check the items in MSMQ (or a sample, if there are many) and check the date. Are these all recent? If not, it might be worth purging the queues. There is documentation on which queues can be purged safely. In theory, a new task should create new MSMQ queues. If you see 'old' queues, delete those. (You can find which MSMQ queue belongs to which task *edit* in SQL somewhere.)

2270 is pretty tough to troubleshoot. This link http://www.veritas.com/docs/000020478 is a start...

Good luck.

GJ

Regards. Gertjan

Hi Matthew_J,

I'd agree with Gertjan about potentially purging those J queues, depending on what you really see in the mailbox. You might have non-fatal errors on the items in there causing them to be retried indefinitely. You mentioned it's not archiving, but what message classes do you see in the mailbox? Is the task at least making these items pendingarchive or pendingarchivepart? If so, it indicates that the items in the MSMQ queues are either older/non-archivable items, or newer items that still exist in the mailbox, or a combination of both.

   If that's the case, you could purge those from the J queues, and use either DocMessageClass (a free 3rd party executable), or even PendingShortcutTimeout in the Mailbox policy with a report mode run to convert the pending items back to IPM.Note so they could be re-attempted by the Journal task. 

    If they still cause the same issues when being archived again, then Mr. Laudone's point about potential corrupted items still in the mailbox would need to be addressed, but there would be another whole set of steps with separate actions to identify those.  It's probably more efficient to focus on the easier route first. 

Thanks,
Daveoflave

Thanks to everyone here for the insightful and helpful replies. I wanted to follow up to say that we got it working, and the fix was simple... one of those times when you lose sight of the simple things because you have the worst case scenario in your head.

The fix was to run the provisioning task, and in fact since we have a scheduled provisioning task at 5:00 pm daily, the problem fixed itself last night.

The root of the issue was that during our Exchange log storage problems, the Journal mailbox DB failed over to another DAG member, and that DAG member was not configured properly in Vault as a target (we have three DAG members). It appears that even after we fixed Exchange, with the journal mailbox DB active on this other DAG member, Vault was still archiving fine out of the journal mailbox.... until the provisioning task ran at 5:00 PM that day. After that, Vault was aware that the mailbox DB was on a different DAG member and attempted to contact it to archive... and couldn't, because we did not have it properly configured as a target.

We failed the journal mailbox DB back to the original DAG member, since we were thinking that was a problem... but we failed to manually run the provisioning task after that so it never picked up the change.

My questions after the fact are... what is the best method for adding ALL DAG members to Vault? Obviously we want them all configured. Should each DAG member have a journal archiving task associated as well, or should we only have one journal archiving task?

Anyway, always good to have a reminder to look at the simple things first. I am on the warpath now to get this upgraded to 11, then 12.... it was certainly a sinking feeling to have the journal mailbox ticking up rapidly and find out we were so far out of a supported version.

Thanks again for the help and suggestions. This community is a great resource!

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hello Matthew,

You need to add each Exchange Server which can host the Journal Mailbox database to the Target section in EV. I assume that the Enterprise Vault System Mailbox is in the same database as the Journal Mailbox(es).
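A quick way to sanity-check that rule is to compare the servers that can host the journal mailbox database against the servers listed as targets in EV. The Python sketch below just does that comparison on two lists you fill in by hand. It does not query Exchange or Enterprise Vault, and the server names (EXCH1 to EXCH3) and the function name `missing_targets` are made up for illustration.

```python
def missing_targets(possible_hosts, configured_targets):
    """Return servers that can host the journal DB but are not EV targets.

    Names are compared case-insensitively; the result is sorted.
    """
    configured = {name.lower() for name in configured_targets}
    return sorted(h for h in possible_hosts if h.lower() not in configured)


if __name__ == "__main__":
    # Hypothetical three-member DAG where one member was never added as a target.
    dag_members = ["EXCH1", "EXCH2", "EXCH3"]
    ev_targets = ["exch1", "EXCH3"]
    print(missing_targets(dag_members, ev_targets))
```

Any server the check reports is one where a database failover followed by a provisioning run could leave the journal mailbox unreachable, as described earlier in the thread.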

Journal Archiving (as opposed to Mailbox Archiving) follows the Journal Mailbox. It does not matter where the Journal Mailbox lives; the task will continue to archive from it (provided the configuration of EV is correct, obviously :) ). You only need 1 Journal Archiving Task, which archives from the Journal Mailbox. If the Journal Mailbox DB moves to another server, the task will keep processing it. There might be a small delay while the MAPI connections switch, but that is negligible. In my environment, I have databases moving around regularly (intentionally or accidentally), and there is no issue with items building up in the Journal Mailboxes.

I'll check where it is described, but it might be in the installing/configuring guide.

Regards. Gertjan