Exchange 2010 DAG backup, logs not truncating
The logs are not truncating, or at least only occasionally, and I can't find a way to reproduce it for testing. When they have truncated, I can't find any rhyme or reason to it. We are just moving our 2007 environment to 2010, so we only have a handful of mailboxes in the 2010 environment, which is why I'm trying to get this problem resolved so that we can begin moving people over to 2010.
Right now, for example, one of the databases has logs going back to 2/13 at 9:01 AM, even though I've run a successful full backup many times.
Our configuration is as follows:
- We are NOT using Instant Recovery or GRT.
- Backups are successful with status of 0 (I'm doing "Full" backups for the purpose of testing log truncation)
- I've tried backing up both the Active and Passive nodes, and it doesn't seem to make a difference.
- In Exchange, Copy length and Queue length are 0
- I read stuff about a bug w/ the Indexing Service on Exchange: I stopped the Indexing Service on all of the DAG nodes. No change.
- We have a MAPI network and Replication network b/t the nodes. I turned off the Firewall on the Replication Network. No change.
- I've tried dismounting and remounting the database. And I ran ESEUTIL /MH and confirmed that it was clean.
- The Exchange logs indicate the backups were successful. I always see the following two events: one says the logs can't be truncated (225), but the next says they will be once replayed (9827):
- 2/14/2012 12:34:44 PM MSExchangeIS 9827 Exchange VSS Writer
- 2/14/2012 12:34:44 PM ESE 225 ShadowCopy
- The other events that seem odd to me, though maybe it's just the way they read, are the MSExchangeRepl Event 4114 entries. They indicate that the logs are replicating and replaying, and that the backups are occurring, EXCEPT on the Active node, which doesn't show an accurate "Latest Full Backup" time; refer to the following.
Database redundancy health check passed.
Database copy: DAG4
Redundancy count: 3
Errors:
================
Summary Status
================
Name             Status   RealCopyQueue  InspectorQueue  ReplayQueue  CIState
----             ------   -------------  --------------  -----------  -------
DAG4\WTP-EX-MB3  Healthy  0              0               0            Healthy
DAG4\WTP-EX-MB2  Mounted  0              0               0            Healthy
DAG4\WTP-EX-MB1  Healthy  0              0               0            Healthy
===============
Full Status
===============
Identity : DAG4\WTP-EX-MB3
Name : DAG4\WTP-EX-MB3
DatabaseName : DAG4
Status : Healthy
MailboxServer : WTP-EX-MB3
ActiveDatabaseCopy : wtp-ex-mb2
ActivationSuspended : False
ActionInitiator : Administrator
ErrorMessage :
ErrorEventId :
ExtendedErrorInfo :
SuspendComment :
SinglePageRestore : 0
ContentIndexState : Healthy
ContentIndexErrorMessage :
CopyQueueLength : 0
ReplayQueueLength : 0
LatestAvailableLogTime : 2/15/2012 9:29:08 AM
LastCopyNotificationedLogTime : 2/15/2012 9:29:08 AM
LastCopiedLogTime : 2/15/2012 9:17:22 AM
LastInspectedLogTime : 2/15/2012 9:17:22 AM
LastReplayedLogTime : 2/15/2012 9:17:22 AM
LastLogGenerated : 944
LastLogCopyNotified : 944
LastLogCopied : 944
LastLogInspected : 944
LastLogReplayed : 944
LogsReplayedSinceInstanceStart : 13
LogsCopiedSinceInstanceStart : 12
LatestFullBackupTime : 2/14/2012 9:46:03 PM
LatestIncrementalBackupTime : 2/15/2012 3:04:23 AM
LatestDifferentialBackupTime :
LatestCopyBackupTime :
SnapshotBackup : True
SnapshotLatestFullBackup : True
SnapshotLatestIncrementalBackup : True
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup :
LogReplayQueueIncreasing : False
LogCopyQueueIncreasing : False
OutstandingDumpsterRequests : {}
OutgoingConnections :
IncomingLogCopyingNetwork :
SeedingNetwork :
ActiveCopy : False
Identity : DAG4\WTP-EX-MB2
Name : DAG4\WTP-EX-MB2
DatabaseName : DAG4
Status : Mounted
MailboxServer : WTP-EX-MB2
ActiveDatabaseCopy : wtp-ex-mb2
ActivationSuspended : False
ActionInitiator : Service
ErrorMessage :
ErrorEventId :
ExtendedErrorInfo :
SuspendComment :
SinglePageRestore : 0
ContentIndexState : Healthy
ContentIndexErrorMessage :
CopyQueueLength : 0
ReplayQueueLength : 0
LatestAvailableLogTime :
LastCopyNotificationedLogTime :
LastCopiedLogTime :
LastInspectedLogTime :
LastReplayedLogTime :
LastLogGenerated : 0
LastLogCopyNotified : 0
LastLogCopied : 0
LastLogInspected : 0
LastLogReplayed : 0
LogsReplayedSinceInstanceStart : 0
LogsCopiedSinceInstanceStart : 0
LatestFullBackupTime :
LatestIncrementalBackupTime :
LatestDifferentialBackupTime :
LatestCopyBackupTime :
SnapshotBackup :
SnapshotLatestFullBackup :
SnapshotLatestIncrementalBackup :
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup :
LogReplayQueueIncreasing : False
LogCopyQueueIncreasing : False
OutstandingDumpsterRequests : {}
OutgoingConnections :
IncomingLogCopyingNetwork :
SeedingNetwork :
ActiveCopy : True
Identity : DAG4\WTP-EX-MB1
Name : DAG4\WTP-EX-MB1
DatabaseName : DAG4
Status : Healthy
MailboxServer : WTP-EX-MB1
ActiveDatabaseCopy : wtp-ex-mb2
ActivationSuspended : False
ActionInitiator : Administrator
ErrorMessage :
ErrorEventId :
ExtendedErrorInfo :
SuspendComment :
SinglePageRestore : 0
ContentIndexState : Healthy
ContentIndexErrorMessage :
CopyQueueLength : 0
ReplayQueueLength : 0
LatestAvailableLogTime : 2/15/2012 9:29:08 AM
LastCopyNotificationedLogTime : 2/15/2012 9:29:08 AM
LastCopiedLogTime : 2/15/2012 9:17:22 AM
LastInspectedLogTime : 2/15/2012 9:17:22 AM
LastReplayedLogTime : 2/15/2012 9:17:22 AM
LastLogGenerated : 944
LastLogCopyNotified : 944
LastLogCopied : 944
LastLogInspected : 944
LastLogReplayed : 944
LogsReplayedSinceInstanceStart : 13
LogsCopiedSinceInstanceStart : 12
LatestFullBackupTime : 2/14/2012 9:46:03 PM
LatestIncrementalBackupTime : 2/15/2012 3:04:23 AM
LatestDifferentialBackupTime :
LatestCopyBackupTime :
SnapshotBackup : True
SnapshotLatestFullBackup : True
SnapshotLatestIncrementalBackup : True
SnapshotLatestDifferentialBackup :
SnapshotLatestCopyBackup :
LogReplayQueueIncreasing : False
LogCopyQueueIncreasing : False
OutstandingDumpsterRequests : {}
OutgoingConnections :
IncomingLogCopyingNetwork :
SeedingNetwork :
ActiveCopy : False
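For anyone who wants to compare the backup fields across copies without reading the whole dump, here is a minimal Python sketch. It assumes the "Full Status" text above has been saved as plain `Key : Value` lines; the function name `parse_copies` and the shortened sample are my own illustration, not part of any Exchange or NetBackup tool.

```python
def parse_copies(text):
    """Split 'Key : Value' status output into one dict per database copy."""
    copies, current = [], None
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their own colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Identity":
            # Each copy's record starts with its Identity line.
            current = {}
            copies.append(current)
        if current is not None:
            current[key] = value
    return copies


# Shortened sample taken from the output above (most fields omitted).
sample = r"""Identity : DAG4\WTP-EX-MB3
ActiveCopy : False
LatestFullBackupTime : 2/14/2012 9:46:03 PM
Identity : DAG4\WTP-EX-MB2
ActiveCopy : True
LatestFullBackupTime :
"""

# List the copies that report no full backup time at all.
missing = [c["Identity"] for c in parse_copies(sample)
           if not c.get("LatestFullBackupTime")]
print(missing)
```

With the sample above, only the active copy (WTP-EX-MB2) ends up in `missing`, which matches what the 4114 events show.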
Although the following thread never reached a concrete conclusion, in my opinion, I think it describes what we were mistaking for a problem:
https://www-secure.symantec.com/connect/forums/exchange-2010-sp1-and-nbu-70-logs-not-truncating
After some more reading, the "Checkpoint" appears to essentially be a minimum number of logs that Exchange will retain. Since our 2010 environment was brand new and we'd only moved a couple of users to it, we weren't generating enough logs to reliably test truncation after each backup. This also explains the seemingly intermittent and odd numbers of truncated logs that we were seeing.
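To make that reasoning concrete, here is a rough Python sketch of how I understand it. This is a simplified model, not Exchange's actual algorithm: it assumes a log can only be removed once it has been backed up, replayed by every copy, and has fallen below the checkpoint. The function name, the checkpoint depth of 100 logs, and the starting log number 900 are hypothetical values chosen for illustration; 944 is the LastLogGenerated value from the status output above.

```python
def truncatable_range(first_on_disk, last_generated, last_backed_up,
                      lowest_replayed, checkpoint_depth_logs):
    """Return the log generations that this simplified model would allow
    to be truncated (an empty range if none qualify)."""
    # Assumption: the checkpoint trails the newest log by a fixed number of logs.
    checkpoint_log = last_generated - checkpoint_depth_logs
    # A log qualifies only if it is backed up, replayed on every copy,
    # and strictly below the checkpoint.
    upper = min(last_backed_up, lowest_replayed, checkpoint_log - 1)
    if upper < first_on_disk:
        return range(0)
    return range(first_on_disk, upper + 1)


# Quiet database: few logs since the oldest one on disk, so nothing qualifies.
quiet = truncatable_range(900, 944, 944, 944, 100)
# Busier database: the same backup now leaves plenty of logs below the checkpoint.
busy = truncatable_range(900, 2000, 2000, 2000, 100)
print(len(quiet), len(busy))
```

In this model the quiet database truncates nothing even after a successful full backup, while the busier one truncates everything below the checkpoint, which is the pattern we were seeing.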
So, after moving some more users into the 2010 environment (which generated more logs), truncation appears to be working as described by the Checkpoint literature. It also appears to work regardless of whether we back up from the passive or active nodes. I'll update this thread if anything changes.