03-21-2013 08:27 AM
Hello all,
I am out of ideas; hopefully one of you has one.
Journal Archiving, target is EMC Centera. 5 journal archiving servers: 3 work, 2 don't. The journal mailboxes handled by the ones that don't have tens of thousands of items in PendingArchive state. I do not see them going to pendingarchiving.part. AFAIK nothing has changed on the network.
MSMQ seems to be working. There are some, but not many, large messages.
We've restarted services and rebooted. On one of the journal mailboxes, I used docmessageclass to return the message class to IPM.Note; the items are then archived.
Any ideas?
thx,
Gertjan
03-21-2013 08:51 AM
Does it keep occurring?
Also, with the Centera, is that using Collections at all?
And after you use docmessageclass to revert them, did you restart any tasks or services?
The vault store that they're writing to, is it set to After Backup or Immediately after archive?
Does the Centera have any replicas?
Does each EV server write to the same archive and same vault store?
Or do you have individual vault stores for each server, each with its own Centera partition?
If it's different vault stores and different partitions, do they all use the same Centera node IP address?
What about the PEA files, are they all the same?
If it happens again, find the oldest item in a pending state and check in Browser Search whether the item has actually been archived and indexed and is just waiting to be post-processed.
Also, I'm guessing there's nothing in the event logs giving any clues?
And the MSMQ quota hasn't been breached at any point?
My best guess is that there's something going on in the background where it sleeps for one minute, wakes up, hits an issue, goes back to sleep for a minute, and so on. You then restart the task and suddenly it starts working.
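A rough sketch of the "find the oldest pending item" check, in case it helps. It assumes the journal mailbox contents have already been exported into simple records by whatever tool you prefer. The `MailboxItem` type, its field names and the `PendingArchive` substring match are made up for illustration; this is not an Enterprise Vault or Exchange API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class MailboxItem:
    """Hypothetical record for one item in an exported mailbox listing."""
    subject: str
    message_class: str
    received: datetime


def oldest_pending(items: Iterable[MailboxItem],
                   marker: str = "PendingArchive") -> Optional[MailboxItem]:
    """Return the oldest item whose message class contains `marker`.

    The substring match is an assumption: adjust `marker` to whatever
    message class your pending items actually carry.
    """
    pending = [i for i in items if marker.lower() in i.message_class.lower()]
    # min() with default=None avoids an exception when nothing is pending.
    return min(pending, key=lambda i: i.received, default=None)
```

The item it returns is the one to look up in Browser Search first; if that one turns out to be archived and indexed already, the backlog is probably in post-processing rather than in archiving itself.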
03-21-2013 08:48 AM
03-22-2013 01:28 AM
@mmcr, I suspect something like this, but so far I have not been able to find an item that might be causing it.
@ JW,
03-22-2013 01:59 AM
Hm....
I am investigating items that are being moved to the 'failed to copy' folder in the Journal Mailbox.
All messages in here have, inside the mail to be journaled, another mail that is empty. It is either an attachment (MSG file) or an MSG file that seems to have been dragged into the original mail.
No sender, recipient, subject or body. When opened, it states: This message has not been sent.
Strange. Checking further.
03-25-2013 06:12 AM
Fixed..
This was a combination of a mail-flood, EMC connectivity errors, and corrupted mail.
Initially we noticed that 3 journal mailboxes received about 100,000 mails in 1 hour, resulting from a faulty workflow in SharePoint. Some of these messages were corrupt: they had MSG files attached to or embedded in the original message that were empty (no sender/recipient/subject/body). Deleting that specific MSG resulted in the item being archived.
In addition, there was a storage connectivity error to the EMC, caused by a faulty hard disk and some network settings that had been changed without our knowledge. This was fixed.
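For anyone wanting to screen for the same pattern, here is a minimal sketch of the check described above: flag any message that carries an attached or embedded message with no sender, recipient, subject or body. The `Message` type and its fields are hypothetical stand-ins for whatever your MAPI or MSG parsing tool returns, not a real library API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical, simplified view of a mail item and its embedded MSGs."""
    sender: Optional[str] = None
    recipients: List[str] = field(default_factory=list)
    subject: Optional[str] = None
    body: Optional[str] = None
    embedded: List["Message"] = field(default_factory=list)


def _blank(text: Optional[str]) -> bool:
    """True when a text field is missing or only whitespace."""
    return text is None or not text.strip()


def is_empty(msg: Message) -> bool:
    """True when the message has no sender, recipients, subject or body."""
    return (_blank(msg.sender) and not msg.recipients
            and _blank(msg.subject) and _blank(msg.body))


def has_empty_embedded(msg: Message) -> bool:
    """True when any attached/embedded message, at any depth, is empty."""
    return any(is_empty(e) or has_empty_embedded(e) for e in msg.embedded)
```

Items for which `has_empty_embedded` returns true would be the candidates to inspect by hand before removing anything.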
The issue was resolved as follows. As we have spare journal mailboxes, we changed the Exchange stores to start using the spares. We set the 'pending shortcut timeout' to 0, and then ran the Journal Archiving tasks handling the huge journal mailboxes in report mode. This took about 2 days to turn all 'pending' items back into normal items.
We then restarted the tasks in normal mode and monitored the mailboxes. Items were archived rapidly; because the journal mailboxes were static (i.e. no new items were being added), this was relatively quick.
Currently everything is back to 'normal'
Thanks to all for putting in your thoughts; it helped resolve the issue.
Does it keep occurring? Not anymore. Some actions were done on the Centera, which seem to have sort of fixed it. I've changed the journal mailboxes on the stores, and those 'new' journal mailboxes are being archived.
Also, with the Centera, is that using Collections at all? Yes, it is.
And after you use docmessageclass to revert them, did you restart any tasks or services? Yes.
The vault store that they're writing to, is it set to After Backup or Immediately after archive? Immediately after archive.
Does the Centera have any replicas? Yes.
Does each EV server write to the same archive and same vault store? Each server writes to the same store, but to different archives. We have 1 archive for each Exchange server (i.e. server1: 7 stores, 3 tasks, writing to archive1 in vs1, etc.).
Or do you have individual vault stores for each server which have Centera partitions for each? Each server has its own store, in a dedicated Centera partition.
If it's different vault stores, different partitions, do they all use the same Centera node IP address? Same node address.
What about the PEA files, are they all the same? 'They'... we use 1 PEA file, stored on a share that is accessible by all EV servers.
If it happens again, find the oldest item in a pending state, have a look and see in Browser Search as to whether the item has actually been archived and indexed and is just waiting to be post-processed.
Will try.
Also, I'm guessing there's nothing in the event logs giving any clues? Nope. Although there is an issue with the Centera: we enabled storage expiry, and it is expiring many items (millions), which causes storage issues.
And the MSMQ quota hasn't been breached at any point? Not as far as I can determine. The quota is 20GB.
My best guess is that there's something in the background going on where it's sleeping for one minute, wakes up, an issue occurs, goes back to sleep for a minute etc....you then restart the task and suddenly it starts working. That is my guess too; I am trying to figure out wtf.
Thanks for your ideas. They do give some direction.