I'm trying to find a good workflow technote on EV journalling to help troubleshoot a slow journal archiving scenario.
Is there anything around that could help?
Essentially we're testing a single journal archive that receives about 4 messages a second (about 15k per hour). I was expecting the EV server to keep pace with this over a 24-hour period, but even after 1 hour it falls woefully behind, and before long the number of IPM.Note messages rises above 1000. That has the effect of causing the pending items never to be removed from the journal mailbox - Symantec say this is a hard-coded limit - which subsequently causes journaling issues with Exchange.
A Dtrace run by Symantec has shown that the EV server is archiving about 1 item every 2 seconds.
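To put those two figures side by side, here is a rough back-of-the-envelope sketch in Python. It assumes both rates stay constant (4 messages a second arriving, 1 item archived every 2 seconds) and treats the 1000-item figure as a simple backlog threshold. The function names are made up for illustration and nothing here talks to EV or Exchange.

```python
def backlog_after(seconds, arrival_per_s, archive_per_s, start=0.0):
    """Items still waiting after `seconds`, assuming both rates are constant."""
    net = arrival_per_s - archive_per_s
    return max(0.0, start + net * seconds)


def seconds_until(threshold, arrival_per_s, archive_per_s, start=0.0):
    """Seconds until the backlog reaches `threshold`; None if it never grows."""
    if start >= threshold:
        return 0.0
    net = arrival_per_s - archive_per_s
    if net <= 0:
        return None
    return (threshold - start) / net


ARRIVAL = 4.0   # messages per second reaching the journal mailbox
ARCHIVE = 0.5   # items per second, i.e. 1 item every 2 seconds per the Dtrace

print("Arrivals per hour:", ARRIVAL * 3600)
print("Backlog after 1 hour:", backlog_after(3600, ARRIVAL, ARCHIVE))
print("Seconds until 1000 items queue up:", seconds_until(1000, ARRIVAL, ARCHIVE))
```

On those assumptions the backlog grows by 3.5 items every second, so the mailbox would pass 1000 waiting items in under five minutes. The archiving rate has to rise to at least the arrival rate for the mailbox to drain at all.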
Symantec have offered several opinions. They said the issue is with the Hub Transport server causing a bottleneck, as Exchange passes the messages back through the HT server when EV archives the items...!? Really? I can't see any connection between archiving items from the journal mailbox and the HT server - surely once the email is in the journal mailbox the HT server is no longer in the equation - but I'm happy to stand corrected.
They also say that another EV journal server could help, but looking at the resources of the current journal server, it's barely breaking a sweat as far as CPU and RAM go.
The SQL server is also fine as far as CPU/RAM/network goes.
Should I be concentrating on disk I/O (on the EV server or the SQL server)? Message queues (which ones)? Disk I/O on the message queue disk?
From what I can see visually, the items change from IPM.Note to Pending Archive pretty quickly but then spend an age being deleted from the mailbox - so what is EV doing during that part of the process?
There are probably lots of things that can cause archiving to be slow so I guess I'm just after a set of logical steps to narrow things down - hence why I thought some kind of process flow would be a good place to start.
thanks in advance
Which version of Exchange?
I think HT can be discounted: if HT were the cause, I'd imagine the mail wouldn't reach the mailbox in a timely manner. How about the CAS servers, though? What is their utilization? Any signs of them dropping connections? Is there a CAS array, load balancer, etc.?
Check the journal mailbox for an abundance of failed messages. If there are a lot, I've seen this caused by poor connectivity between EV and Exchange due to CAS and load balancer issues. Also check the event log for clues: is EV 'sleeping' for 60-second intervals, etc.?
To rule out the CAS array and load balancer, try editing the hosts file on the EV journal server to point to a specific CAS that is perhaps being utilised less than the rest.
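After editing the hosts file it's worth confirming that the override has actually taken effect on the EV server. A minimal Python sketch follows, assuming Python is available on that box and that the OS resolver honours the hosts file (it normally does). The hostname and IP address in the usage comment are hypothetical placeholders, not values from this thread.

```python
import socket


def resolved_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the IPv4 addresses the local resolver reports for `hostname`."""
    _name, _aliases, addresses = resolver(hostname)
    return addresses


def points_only_at(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """True if `hostname` resolves to exactly one address, `expected_ip`."""
    return resolved_ips(hostname, resolver) == [expected_ip]


# Hypothetical usage on the EV journal server after the hosts file change:
#   points_only_at("cas-array.example.local", "10.0.0.21")
# A False result suggests the name is still going via the array or the
# load balancer rather than the single CAS you pinned it to.
```

The `resolver` parameter is only there so the check can be exercised without real DNS. In normal use you'd leave it at the default.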
I'd also tinker with the Journal task settings (number of connections, items per pass); start off fairly low (200 or so items per pass and 5 connections). There are also a number of settings in the journal policy to tinker with.
Other places to look: how's the GC (Global Catalog)? Is its performance OK?
There's some sensible advice and suggestions here:
Generally in my environment (4 journal mailboxes and 4 EV servers) we run about 100,000 emails per hour with peaks of 200,000. That's 25-50K emails per hour per server. Between the initial copy (change to pending shortcut) and the finish (remove shortcuts) there is a 7-minute delay, occasionally much longer (30-45 minutes). I've had up to 100,000 emails in a single journal mailbox, which eventually cleared (during non-peak hours).
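For comparison with the 4 messages a second in the original question, the figures above can be converted to per-server, per-second rates. This is only a sketch of the arithmetic, assuming the load is spread evenly across the four servers. The function name is illustrative.

```python
def per_server_per_second(emails_per_hour, servers):
    """Average messages per second handled by each server, assuming an even split."""
    return emails_per_hour / servers / 3600.0


typical = per_server_per_second(100_000, 4)   # 25,000 per hour per server
peak = per_server_per_second(200_000, 4)      # 50,000 per hour per server

print(f"Typical: {typical:.2f} msgs/s per server")
print(f"Peak:    {peak:.2f} msgs/s per server")
```

That works out to roughly 7 messages a second per server normally and about 14 at peak, so a sustained 4 a second on a single journal server is well within what this environment handles.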
All this being said, I run the default number of connections on the archive task and storage tasks. I have attempted to increase the storage threads, but this hurt performance, so I reverted them to the default. AV scanning can be a big issue and, if installed, should be disabled long enough to validate any impact. HT has no effect. The DC can have an impact, so you might want to specify a DC as a test.
Thanks all - some good info here.
I'm hoping we've nailed it: we discovered some mailbox scanning going on, and stopping it improved performance immensely - basically it fixed the issue. Testing will continue, but it's looking good so far.
But thanks to everyone - I think basically Andy hit the target with his first email.
I do however have a bit of a followup question (related) which someone may be able to answer easily for me:
I've read several posts where people have increased the MSMQ size to 8GB, which is done from the 'General' tab in the properties of the MSMQ feature in 'Computer Management'.
In Windows Server 2008 there are two limits: one is 'storage' (which is already set to 8GB by default) and the other is 'journal', which is set to 1GB by default. Is it the 1GB 'journal' limit that is recommended to be increased to 8GB, and is this a best practice anyway, or should it only be adjusted if advised by Symantec support?
Although this is closed, here's a response on the MSMQ journal:
MSMQ Journal is ONLY used if you enable journaling on MSMQ. You enable journaling per queue (right-click any of the EV private queues; on General, at the bottom, is Journal - Enabled, which is unticked!).
You really do not want this until specifically asked by Support. It will fill up rapidly. The Journal Queue itself is under the System queues.
So, it cannot harm to extend the size for the Journal, but it is not necessary. As Andrew states, it is best practice to set it for both.
I'm perhaps still a little confused. So the limit associated with the MSMQ journal setting has nothing to do with the EV journal queues but instead refers to the actual journal for the MSMQs themselves (I can get that bit), and enabling MSMQ journaling isn't a good idea unless Symantec support say to do it, as it fills up quickly, but it's still good practice to change the setting from 1GB to 8GB anyway? Is that just in case Symantec say to enable it?
But I seem to have seen a couple of posts where users appear to have 'fixed' slow journal mailbox archiving by increasing this figure, and also some posts where Alex has spoken about 'hitting the 1GB limit':
https://www-secure.symantec.com/connect/forums/journal-archiving-very-slow (final post)
https://www-secure.symantec.com/connect/forums/journaling-mailbox-keeps-increasing (see Alex's bit on 'breeching the 1Gb limit')
I still think there is a concept here that I'm misunderstanding.
I've started a new thread on this now, as I'm trying to research the archive queues a little more closely to help me understand the inner workings.