
Managing EV archive having 12 exch2013 nodes - problems for EV archiving when the ExchDBs move around.

Sani_B
Level 6
Partner Accredited

Hi,

We have one client who has 48 Exchange DBs spread across an Exchange 2013 DAG with 12 nodes. Each Exchange DB has an active and a passive copy, so that if a problem occurs the other node takes over the DB.

I have two clustered EV servers covering these 12 tasks, 6 each.

 

What I'm seeing is that automatic archiving is slow or nonexistent. Apparently, when the A5 queue is loaded with the mailboxes for that run, the task handles only a portion of them before the scheduled window ends and the task stops. If a DB then moves to another Exchange node, the unhandled archive requests stay in the A5 queue. When the next scheduled run starts, the task stalls on these queued requests for mailboxes that are no longer on the node it is supposed to handle, so automatic archiving and mailbox synchronizing are extremely slow or nonexistent and the MSMQ queues never seem to empty... Is there a solution for this behaviour?

Found this forum discussion:

https://www-secure.symantec.com/connect/forums/mailbox-schedule-archiving-halts-after-moving-mailbox...

 

What is the SynchInMigrationMode registry key mentioned in there?

 

Environmental facts:

Around 44 000 mailboxes - around 10 000 Enabled for EV

 

 

Exchange 2013 CU5

DAG 12 nodes

48 Exchange DB

 

Enterprise Vault 11.0.0

2 clustered (active - passive) EV servers - Server 2008 R2

Each EV server handles 6 exchange nodes

 

Sani B.

 

 

1 ACCEPTED SOLUTION


Sani_B
Level 6
Partner Accredited

Just so you know, the environment is working fine now that Exchange is at version 2013 CU8, Enterprise Vault is at 11.0.1 CHF 1, and the EV server has Outlook 2013 (the supported version).

After those were updated I deleted all the tasks, the MSMQ queues and the target Exchange servers, then added all the Exchange servers back using their FQDNs (they were originally added using short names).
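For anyone wanting to check their own target list before re-adding servers, here is a minimal sketch of a short-name check. It is plain Python and does not talk to Enterprise Vault at all; the server names and the `example.local` domain are made up for illustration, and the rule used (at least two valid DNS labels) is only an assumption about what counts as an FQDN here.

```python
def is_fqdn(name: str) -> bool:
    """Return True if name looks like a fully qualified domain name."""
    labels = name.rstrip(".").split(".")
    # A short (NetBIOS-style) name has a single label; an FQDN has host plus domain labels.
    if len(labels) < 2:
        return False
    return all(
        label
        and len(label) <= 63
        and not label.startswith("-")
        and not label.endswith("-")
        and label.replace("-", "").isalnum()
        for label in labels
    )


def short_names(targets):
    """Return the targets that were entered without a domain part."""
    return [target for target in targets if not is_fqdn(target)]


# Hypothetical target list, not taken from the real environment.
targets = ["exch01", "exch02.example.local", "exch03"]
print(short_names(targets))  # ['exch01', 'exch03']
```

Anything the helper returns would be a candidate for re-adding under its full name.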

 

After this the EV started to function as it should!! :)

 

Over a year it took but now it works! yay! :)

 

Sani B.


17 REPLIES

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hello Sani,

Is it possible to run the provisioning task before the scheduled run kicks in again?

Provisioning updates the mailbox-location, and that might help to get the queue moving properly, not sure though.

As for the SynchInMigrationMode key, that comes into play when you move mailboxes to an exchange server and do not want new archives to be created. From KB: http://www.symantec.com/docs/HOWTO98184

Specifies whether, when migrating mailboxes from one Exchange Server to another, Enterprise Vault automatically assigns migrated mailboxes to existing archives. SynchInMigrationMode affects only the association between user mailboxes and archives; it does not affect the association between journal mailboxes and their archives.

 

Regards. Gertjan

Sani_B
Level 6
Partner Accredited

Hi Gertjan,

I'll try it and see whether running the provisioning task before the scheduled archiving tasks / mailbox synchronizing has any effect on this problem, thank you.

Sani B.

JesusWept3
Level 6
Partner Accredited Certified
Are all those Exchange servers active, btw? 6 Exchange servers each is a little much for one EV server
https://www.linkedin.com/in/alex-allen-turl-07370146

WiTSend
Level 6
Partner

Question:  Do you have a separate EV System mailbox on each server and is it on a non-replicated DB?

Sani_B
Level 6
Partner Accredited

All Exchange servers are active and there are two clustered EV servers handling 6 each.

Yes, there is a system mailbox for each Exchange node, but those are among the other mailboxes and also move around sometimes... I didn't know you could have both replicated and non-replicated DBs in Exchange?

 

Running the provisioning right before the scheduled archive run, and a second time before the synchronize, seems to have reduced the problem somewhat but it's not solved... Synchronizing is still extremely slow, and manually synchronizing a mailbox seems to fail with a timeout most of the time, which is also when 7206 events appear. The R1 and R2 queues still seem to get some "ghost" items occasionally... I tried to dtrace the synchronizing but it didn't catch anything...

 

Sani B.

 

Sani_B
Level 6
Partner Accredited

This morning I can actually see items in a couple of R2 queues, and the date on them says yesterday afternoon...

I restarted the Task Controller service and it gives me plenty of 7206 and 3230 events, and then starts the carousel with 3430, restarting the tasks over and over...

 

The 3230 events

Could not create a new MAPI session on Exchange server 'EVserver5'.

The most likely cause is a connection issue that is causing excessive delays in the task. This can often be recovered by restarting the MAPI related tasks.

Internal References:

Unable to get exclusive access to the MAPI thread pool.

Mutex name: Lie mode lock

Reason: Last lock holder: Process <11612> <RetrievalTask> Thread <10692>

 

They all always say RetrievalTask, and when I look at the running processes there is no process with that number. The items can no longer be seen when opening the R2 queue, but the main pane view still shows that there are items in the queue...

Sani B.

Trafford
Level 4
Partner Accredited

Hi Sani,

I can sympathise with you. We have just gone through a similar situation with our 4x Exchange 2010 / 6x Exchange 2013, 4x EV10 environment.

The technote you mentioned deals with the situation where mailboxes are being migrated from an old Exchange server to a new one. To maintain the links between mailboxes and archives, EV has Synch In Migration Mode.

See - HowTo32318-SynchInMigrationMode

What we did :-

Create a new database on each Exchange server. DO NOT enable it in a DAG or for failover, so that it lives and dies with its server. Create your EV System Mailboxes in these DBs. This at least keeps the EV connection with the server alive.

The throttling policy and NSPI connection limits can severely restrict EV working with AD and Exchange. Make sure these are set correctly. See

HOWTO97542-EV11-Configuring the Exchange throttling policy on the Vault Service account

TECH73507-Win 2008&12 DCs restrict NSPI connections-can cause EV archiving tasks to fail

Work through this troubleshooting guide to confirm that the environment is good

TECH35774-Troubleshooting Exchange connectivity issues with EV

After all the troubleshooting we re-created our tasks and cleaned up all the orphaned queues (deleting a task does NOT delete its queues, and new queues are created for the new task)

TECH38085-How to purge Microsoft Message Queues (MSMQ) used by Enterprise Vault.

We set our Site and/or task schedules for 15 minute intervals.

Set your archiving to run 15 minutes (or another suitable time) after a provisioning run.

E.g. if you want to start archiving at 18:00, set a provisioning run for 18:00 and then start the archiving at 18:15 or 18:30

We also discovered that the Archiving Task performs a synchronization at the beginning of every run, so it is not so critical to time your scheduled synchronizations

Make sure your archiving window does not overlap with EV backups, SQL Server backups and maintenance, Exchange maintenance, AV scans etc. All of these can interfere with archiving tasks and result in orphaned items in queues

Unfortunately it seems that if a DB fails over during an archiving run, it can cause problems with EV, leaving items in queues etc., as you have discovered. As long as EV targets Exchange servers and not DBs, I think this is something we will have to live with.

We also discovered that if an Archiving Task has NO targets (e.g. they all failed over somewhere else), the task will not start an archiving run, leaves no messages in the event log, and does not generate a report. This might lead you to think the task has failed, when it just had nothing to do.
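The overlap check above is easy to get wrong when a window runs past midnight, so here is a minimal sketch of it in Python. The job names and times are invented for illustration (only the 18:15 archiving start follows the example above), and the helper functions are hypothetical, not part of any EV tooling.

```python
MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm: str) -> int:
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def segments(start: str, end: str):
    """Split a daily window into same-day segments; a window whose end is
    not after its start is treated as wrapping past midnight."""
    s, e = to_minutes(start), to_minutes(end)
    if s < e:
        return [(s, e)]
    return [(s, MINUTES_PER_DAY), (0, e)]


def overlaps(window_a, window_b) -> bool:
    """True if two daily (start, end) windows share any minute."""
    return any(
        a0 < b1 and b0 < a1
        for a0, a1 in segments(*window_a)
        for b0, b1 in segments(*window_b)
    )


# Hypothetical schedule: provisioning at 18:00, archiving from 18:15.
archiving = ("18:15", "02:45")
other_jobs = {
    "EV backup": ("03:00", "05:00"),
    "SQL maintenance": ("02:00", "04:00"),
}
conflicts = [name for name, window in other_jobs.items() if overlaps(archiving, window)]
print(conflicts)  # ['SQL maintenance']
```

Windows are treated as half-open, so a job that starts exactly when archiving ends is not reported as a conflict.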

We are considering adding a dummy mailbox to the "Fixed" db so that at least the task always has at least one target, and so will always run.

You might consider some Tuning of the Archiving Task 

Number of concurrent connections to Exchange: defaults to 5. We reduced ours to 2, which eliminated task failures but slowed down archiving. We then put it up to 3: more archiving, and only 1 task failure so far.

Max items per target per pass: defaults to 1000. We reduced ours to the minimum, 200. EV then spends less time in each mailbox, ensuring that it gets through all of them in each archiving run, achieving a better levelling effect, and reducing the number of users hitting mailbox or archiving limits.

Our archiving is now running smoothly. Our archiving window is 08:15 to 02:45, and we are pulling 30-50GB per task per run.

Hope this helps a bit, and good luck
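To illustrate the levelling effect, here is a toy model in Python. It assumes a run can only archive a fixed total number of items and visits mailboxes once, in order; both the numbers and the model itself are assumptions for illustration, not a description of how the Archiving Task is implemented.

```python
def mailboxes_reached(backlogs, max_items_per_pass, run_capacity):
    """Count how many mailboxes get at least one item archived in one run."""
    remaining = run_capacity
    reached = 0
    for backlog in backlogs:
        if remaining <= 0:
            break  # the archiving window is used up
        taken = min(backlog, max_items_per_pass, remaining)
        remaining -= taken
        if taken > 0:
            reached += 1
    return reached


# Hypothetical numbers: 50 mailboxes with 5000 eligible items each,
# and a run that can archive 10000 items in total.
backlogs = [5000] * 50
capacity = 10000
print(mailboxes_reached(backlogs, 1000, capacity))  # 10
print(mailboxes_reached(backlogs, 200, capacity))   # 50
```

In this model the lower per-pass limit archives the same total amount but spreads it over every mailbox instead of the first few.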

Sani_B
Level 6
Partner Accredited

Excellent post Trafford, thank you! Most of the content I've already checked/done, but this bit I will have to discuss with the Exchange managers:

"Create a new database on each Exchange server. DO NOT enable it in a DAG or for failover, so that it lives and dies with its server. Create your EV System Mailboxes in these DBs. This at least keeps the EV connection with the server alive."

 

I have an additional question though:

It looks like the archiving run puts every mailbox that exists on a node into the A5 queue, not just the enabled ones... Is this how it is intended to work now in EV version 11, or does anyone know a possible cause for this kind of behaviour? I thought it should only queue mailboxes that have archiving enabled??

Sani B.

 

 

WiTSend
Level 6
Partner

All mailboxes on the server are placed in the queue.  EV opens each mailbox, reads the hidden message and from that determines whether or not and what to archive based on its status and policies.

Sani_B
Level 6
Partner Accredited

Are you sure it's putting all mailboxes in the queue, rather than first reading the hidden message and only then putting the mailbox in the A5 queue if it's enabled...?

 

Sani B.

Sani_B
Level 6
Partner Accredited

Hi,

In the A5 queue there are items for user accounts that don't have archiving enabled.

This is how I know:

The label of the item in A5 tells which user account it is for (user1).

In SQL I run a query:

-- Look up the Enterprise Vault mailbox record for one account
USE EnterpriseVaultDirectory;
SELECT *
FROM ExchangeMailboxEntry
WHERE MbxNTUser = 'user1';

The row returned for user1 shows:

MbxArchivingState = 0
MbxExchangeState = 0
ExchangeMbxType = 1
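Rather than checking one user at a time, the same table could be grouped by MbxArchivingState. Below is a sketch that runs such a query from Python against an in-memory SQLite mock, so it can be tried without touching the real EnterpriseVaultDirectory database (which lives on SQL Server). The table and column names come from the query above; the rows are invented, and the value 1 for an enabled mailbox is an assumption, since the only value seen here is 0 for a mailbox that is not enabled.

```python
import sqlite3

# In-memory stand-in for the directory table; only the columns mentioned above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExchangeMailboxEntry ("
    "MbxNTUser TEXT, MbxArchivingState INTEGER, "
    "MbxExchangeState INTEGER, ExchangeMbxType INTEGER)"
)
# Invented sample rows; state 1 = enabled is an assumption.
rows = [("user1", 0, 0, 1), ("user2", 1, 0, 1), ("user3", 0, 0, 1)]
conn.executemany("INSERT INTO ExchangeMailboxEntry VALUES (?, ?, ?, ?)", rows)


def count_by_state(connection):
    """Return {MbxArchivingState: number of mailboxes in that state}."""
    cursor = connection.execute(
        "SELECT MbxArchivingState, COUNT(*) "
        "FROM ExchangeMailboxEntry "
        "GROUP BY MbxArchivingState "
        "ORDER BY MbxArchivingState"
    )
    return dict(cursor.fetchall())


print(count_by_state(conn))  # {0: 2, 1: 1}
```

Comparing these counts with the number of items in A5 would show whether not-enabled mailboxes are being queued.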

Any thoughts on this...?

WiTSend
Level 6
Partner

The A5 queue is populated with all mailboxes known to be on the associated server.  Each mailbox is opened and the hidden message read to determine the archive policies.  Disabled mailboxes are included in this process.  It is not until the hidden message in the mailbox is opened that the archive task knows what to do with it.

Sani_B
Level 6
Partner Accredited

Okay, so we had a provisioning group that was targeting the whole organization without automatically enabling archiving for everyone; mailboxes had to be enabled manually by the EV admin from the EV server.

I made an AD group containing all the users that had been enabled manually, then ran the provisioning task to move those users to another provisioning group that has automatic enabling turned on but targets only this AD group, so that no enabled archives belonged to the provisioning group targeting the whole organization.

I then deleted the "whole organization" provisioning group, and voilà!! Only the users who have been enabled for archiving now make an entry in the A5 & A7 queues when either an archiving run or a synchronizing run is started...

So to summarize: everything that is targeted by provisioning, regardless of whether it's enabled for archiving or not, will make an entry in MSMQ when the tasks run.
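The summary above can be written down as a small model. This is only a sketch of the observed behaviour, not of EV's internals; the group names and users are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mailbox:
    user: str
    provisioning_group: str   # the group that targets this mailbox
    archiving_enabled: bool


def queue_entries(mailboxes, existing_groups):
    """Users that get a queue entry: every mailbox targeted by an existing
    provisioning group, whether or not archiving is enabled for it."""
    return [m.user for m in mailboxes if m.provisioning_group in existing_groups]


# Hypothetical mailboxes and group names.
mailboxes = [
    Mailbox("user1", "whole-org", archiving_enabled=False),
    Mailbox("user2", "ev-enabled", archiving_enabled=True),
]
# Before: both provisioning groups exist, so both users are queued.
print(queue_entries(mailboxes, {"whole-org", "ev-enabled"}))  # ['user1', 'user2']
# After deleting the whole-organization group, only the enabled user is queued.
print(queue_entries(mailboxes, {"ev-enabled"}))  # ['user2']
```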

 

Sani B.

Sani_B
Level 6
Partner Accredited

Now that the queues are not that full anymore, this might also have been the solution to my original problem / question about the mailboxes moving around etc.

 

The system mailboxes got transferred to their own little stationary DBs on each node, so those won't move around anymore. That's settled.

 

I'm still monitoring how the EV environment behaves with these changes, and at the start of next week I'll come back and let you know whether it's now functioning properly or whether the same, new or whatever issues still exist.

 

Cheers for all your help so far!!

 

Sani B.

GertjanA
Moderator
Partner    VIP    Accredited Certified

Hello Sani,

Thanks for keeping us updated. It is good to see you tackled the problem, and (probably) have resolved it.

 

Regards. Gertjan

Sani_B
Level 6
Partner Accredited

This is a very frustrating environment! The queues are processing really slowly and there are some weird pauses while the scheduled archiving is running. According to the report, the task may have been running for the whole 7 hours it's scheduled for, yet it has processed only a handful of mailboxes... It might process about 5 mailboxes, then do nothing for some time (could be 10 minutes, or 5 hours), and then process some more mailboxes until the time is up and it stops... Dtrace is unable to tell why these delays happen... I'm wondering: when processing of the first few mailboxes reaches the stage where it puts items in the A1 queue so that the icons of the processed items in the mailboxes are changed to fully archived, could that somehow be pausing the processing of the mailboxes and causing these gaps?

 

I'd also like some advice on the environment:

There are two EV servers. EV1 alone handles storage, indexing and the tasks for 6 Exchange servers. The second, EV2, was added to get more connections against Exchange and handles 6 more Exchange servers (so there are 12 Exchange 2013 servers in total). Could the fact that EV1 alone handles both storage and indexing be the reason the MSMQ queues are being processed excruciatingly slowly?

 

I've already talked about adding a second indexing service (and I guess another storage service would have to be added too, as the recommendation is that those run on the same EV server), but Symantec support didn't think it was such a good idea... It had something to do with how the old indexes were built with the first one; adding another would have required a whole lot of DB changes, so that was shot down... But I didn't fully grasp why, since there is an article that suggests maybe adding another indexing service...? http://www.symantec.com/docs/HOWTO97746

 

Is my environment too small to handle the load?

 

12 Exchange 2013 servers

40 000 users, but only around 8000 enabled for archiving, using provisioning that targets Windows groups

 

EV1 server (active/passive cluster) handles:

Storage service

Indexing service

Shopping service

6 exchange tasks (archiving task connections 4 / retrieval task connections 2)

 

EV2 server  (active/passive cluster) handles:

6 exchange tasks (archiving task connections 4 / retrieval task connections 2)

EV2 has all the component services installed, but they are not in use yet, as the server was added to get more connection possibilities towards Exchange (the thread account restrictions).

 

Sani B.

 

 

 

 
