ev9sp2 - journal archiving not processing

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hello all,

I am out of ideas, and perhaps one of you has a suggestion.

Journal Archiving, target is EMC Centera. 5 journal archiving servers: 3 work, 2 don't. On the ones that do not, the journal mailboxes they handle have tens of thousands of items in PendingArchive state. I do not see them going to pendingarchiving.part. AFAIK nothing has changed on the network.

MSMQ seems to be working. There are some, but not many, large messages.

We've restarted services and rebooted. On one of the journal mailboxes, I used docmessageclass to return the message class to IPM.Note. Then the items are archived.

Any ideas?

thx,

Gertjan

Regards. Gertjan
Accepted Solutions

JesusWept3
Level 6
Partner Accredited Certified

Does it keep occurring?
Also with the Centera, is that using Collections at all?
And after you use docmessageclass to revert them, did you restart any tasks or services?
The vault store that they're writing to, is it set to After Backup or Immediately after archive?
Does the Centera have any replicas?
Does each EV server write to the same archive and same vault store?
Or do you have individual vault stores for each server which have Centera partitions for each?
If it's different vault stores and different partitions, do they all use the same Centera node IP address?
What about the PEA files, are they all the same?

If it happens again, find the oldest item in a pending state, then check in Browser Search whether the item has actually been archived and indexed and is just waiting to be post-processed.
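
To make the "find the oldest pending item" step concrete, here is a minimal sketch in Python. It assumes you have already exported the journal mailbox items into some list yourself; the `JournalItem` record, its fields, and the `oldest_pending` helper are hypothetical names for illustration and are not part of Enterprise Vault or Exchange. The only thing taken from this thread is that stuck items carry "PendingArchive" in their message class.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class JournalItem:
    """Hypothetical record for one exported journal mailbox item."""
    subject: str
    received: datetime
    message_class: str


def oldest_pending(items: Iterable[JournalItem],
                   marker: str = "pendingarchive") -> Optional[JournalItem]:
    """Return the oldest item whose message class contains the marker.

    The match is a case-insensitive substring test, so it also catches
    variants such as a ".part" suffix. Returns None if nothing is pending.
    """
    pending = [i for i in items if marker in i.message_class.lower()]
    return min(pending, key=lambda i: i.received, default=None)
```

The item this returns is the one to look up in Browser Search first: if it is already archived and indexed, the backlog is probably in post-processing rather than in archiving itself.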

Also, I'm guessing there's nothing in the event logs giving any clues?
And the MSMQ quota hasn't been breached at any point?

My best guess is that there's something going on in the background where the task sleeps for one minute, wakes up, hits an issue, goes back to sleep for a minute, and so on. You then restart the task and suddenly it starts working.

https://www.linkedin.com/in/alex-allen-turl-07370146


MMcCr
Level 4
Partner Accredited Certified
Stab in the dark: I've seen something similar, and it turned out to be an Excel spreadsheet in an email that EV couldn't archive (or, more accurately, that slowed EV down to a crawl so it ended up timing out!). There were lots of warnings in the event logs for a particular mail subject, and when we found that mail and removed it (all copies of it, as it was all over the place and had been sent to a large audience) we eventually got EV archiving again. It seems that, in our case, EV couldn't handle the number of macros in the Excel spreadsheet. If you're also retrying failed items, maybe turn off the failed item retry too (we had to do this to be able to manage the killer message we got).

GertjanA
Moderator
Partner    VIP    Accredited Certified

@mmcr, I suspect something like this, but so far I have not been able to find an item that might be causing it.

@JW,

Does it keep occurring? Not anymore. Some actions were done on the Centera, which seem to have sort of fixed it. I've changed the journal mailboxes on the stores, and those 'new' journal mailboxes are being archived.

Also with the Centera, is that using Collections at all? Yes, it is.
And after you use docmessageclass to revert them, did you restart any tasks or services? Yes.
The vault store that they're writing to, is it set to After Backup or Immediately after archive? Immediately after archiving.
Does the Centera have any replicas? Yes.
Does each EV server write to the same archive and same vault store? Each server writes to the same store, but to different archives. We have 1 archive for each Exchange server (i.e. server1: 7 stores, 3 tasks, write to archive1 in vs1, etc.)
Or do you have individual vault stores for each server which have Centera partitions for each? Each server has its own store, in a dedicated Centera partition.
If it's different vault stores and different partitions, do they all use the same Centera node IP address? Same node address.
What about the PEA files, are they all the same? 'They'... we use 1 PEA file, which is stored on a share that is accessible by all EV servers.

If it happens again, find the oldest item in a pending state, then check in Browser Search whether the item has actually been archived and indexed and is just waiting to be post-processed.

Will try.

Also, I'm guessing there's nothing in the event logs giving any clues? Nope. Although there is an issue with the Centera: we enabled storage expiry, and it is expiring many items (millions), which causes storage issues.
And the MSMQ quota hasn't been breached at any point? Not as far as I can determine. Quota is 20 GB.

My best guess is that there's something going on in the background where the task sleeps for one minute, wakes up, hits an issue, goes back to sleep for a minute, and so on. You then restart the task and suddenly it starts working. That is my guess too; I am trying to figure out wtf.

Thanks for your ideas. They do give some direction.

 

Regards. Gertjan

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hm....

I am investigating items that are being moved to the 'failed to copy' folder in the Journal Mailbox.

All messages in here contain, inside the mail to be journaled, another mail that is empty. It is either an attachment (MSG file) or an MSG file that seems to have been dragged into the original mail.

No sender, recipient, subject or body. When opened, it states: This message has not been sent.

Strange. Checking further.

Regards. Gertjan

GertjanA
Moderator
Partner    VIP    Accredited Certified

 

Fixed..

This was a combination of a mail flood, EMC connectivity errors, and corrupted mail.

Initially we noticed that 3 Journal Mailboxes received about 100,000 mails in 1 hour, resulting from a faulty workflow in SharePoint. Some of these messages were corrupt. The corrupt messages had MSG files attached or embedded in the original message that were empty (no sender/recipient/subject/body). Deleting that specific MSG resulted in the item being archived.
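
For anyone hunting for the same kind of corrupt item, the check described above (an embedded MSG with no sender, recipient, subject or body) can be sketched roughly as follows. This is only an illustration in Python over already-extracted data: `EmbeddedMessage`, `JournaledMail`, `is_empty` and `find_suspect_mails` are hypothetical names, and the sketch does not read MSG files or talk to Exchange or Enterprise Vault itself.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class EmbeddedMessage:
    """Hypothetical view of an MSG attached to or embedded in a mail."""
    sender: str = ""
    recipients: List[str] = field(default_factory=list)
    subject: str = ""
    body: str = ""


@dataclass
class JournaledMail:
    """Hypothetical view of one mail in the journal mailbox."""
    subject: str
    embedded: List[EmbeddedMessage] = field(default_factory=list)


def is_empty(msg: EmbeddedMessage) -> bool:
    """True if the embedded message has no sender, recipient, subject or body."""
    has_recipient = any(r.strip() for r in msg.recipients)
    return not (msg.sender.strip() or has_recipient
                or msg.subject.strip() or msg.body.strip())


def find_suspect_mails(mails: Iterable[JournaledMail]) -> List[JournaledMail]:
    """Return the mails that carry at least one empty embedded message."""
    return [m for m in mails if any(is_empty(e) for e in m.embedded)]
```

Treating whitespace-only fields as empty is a design choice for this sketch; in our case the fields were simply blank.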

In addition, there was a connectivity error to the EMC storage, caused by a faulty hard disk and some network settings that had been changed without our knowledge. This was fixed.

The issue was resolved as follows. As we have spare journal mailboxes, we changed the Exchange stores to start using the spares. We set the 'pending shortcut timeout' to 0, and then ran the Journal Archiving tasks handling the huge journal mailboxes in report mode. This took about 2 days to turn all 'pending' items back into normal items.

We then restarted the tasks in normal mode and monitored the mailboxes. Items were archived rapidly; because the journal mailboxes were static (i.e. no new items were being added), this was relatively quick.

Currently everything is back to 'normal'.

Thanks to all for putting in your thoughts; it helped resolve the issue.

 

Regards. Gertjan