Solved: Not in a cluster and not

Latro · ‎08-31-2012

The same policy does truncate logs during the weekends but never during weekdays, weekday back ups always hit time window (4hrs) and finish usually more than hour earlier than time window ends.

It used to work perfectly until recently, nothing has changed on either back up nor Exchange server. I recreated back up policy from scratch thinking maybe policy is corrupt but there is no difference.

There are other back up jobs with pretty much same settings backing up different Information Stores (on the same server) and successfully truncating logs.

Few details about environment, logs etc.

NOT using snapshot back up or Instant Recovery (someone suggested to start using snapshot back ups but it worked erlier without this option enabled)
Single Exchange 2007 server, Windows 2003, not replicating nor clustered etc
Backups are successful with status of 0 (Full back ups)
Master server is running NBU 7.0, mail server has 7.0.1 client installed
The Exchange logs indicate the backups were successful, I always see the following events: 221, 223, 213 (ending back up of edb file, starting back up of log files and back up successfully completed)
BPBKAR log files states (can't figure out what is causing this):

12:20:24.223: [4328.9084] <16> dtcp_read: TCP - failure: recv socket (432) (TCP 10054: Connection reset by peer)
12:20:24.223: [4328.9084] <2> tar_base::V_vTarMsgW: WRN - Abnormal backup status, Exchange Transaction Logs were not truncated

Latro · ‎09-04-2012

Apologies for late reply.

I have changed time window to start back up job earlier so it starts when server is not busy with everyday tasks - and logs have been truncated today. I have observed earlier similar behaviour when logs occassionally were truncated so it needs more time to see if this is permanent or just one off.

I will monitor it and if the issue happens again I will post bkbkar/bpbrm logs files.

View solution in original post

Nicolai · ‎08-31-2012

CCR cluster ?. Then check log replication. Exchange will not truncate the log if the replication doesn't work as intended.

Latro · ‎08-31-2012

Not in a cluster and not replicating, so this is not it.

Marianne · ‎08-31-2012

You are correct that this is preventing log truncation: dtcp_read: TCP - failure: recv socket (432) (TCP 10054: Connection reset by peer) Please check bpbrm log on media server for possible timeout. What is Client Read Timeout on media server?

Handy NetBackup Links

Latro · ‎08-31-2012

Client Read Timeout is 600 sec. on client server it was 300, I changed it to 600sec so it matches Media Server. Same with File Browse timeout, it's up to 500 from 200 sec on client server.

In regards to bpbrm logs - I do not see anything about connection time out or reset.

Just in case bits from today's back up:

BPBKAR on client server:

12:24:47.056: [7632.4696] <2> tar_base::V_vTarMsgW: INF - Total Compressed Size:1057791821 + 69GB
12:24:47.056: [7632.4696] <2> tar_base::V_vTarMsgW: INF - Total Size:743975342 + 111GB
12:24:47.056: [7632.4696] <4> ov_log::OVLoop: Timestamp
12:24:47.056: [7632.4696] <4> tar_backup::backup_done_state: INF - number of file directives not found: 0
12:24:47.056: [7632.4696] <4> tar_backup::backup_done_state: INF - number of file directives found: 1
12:24:47.056: [7632.4696] <2> tar_base::V_vTarMsgW: INF - Client completed sending data for backup
12:24:47.056: [7632.4696] <4> tar_base::stopKeepaliveThread: INF - waiting for keepalive thread to exit ...
12:24:47.056: [7632.1792] <4> tar_base::keepaliveThread: INF - keepalive thread terminating (reason: WAIT_OBJECT_0)
12:24:47.056: [7632.4696] <4> tar_base::stopKeepaliveThread: INF - keepalive thread has exited. (reason: WAIT_OBJECT_0)
12:24:47.056: [7632.4696] <2> tar_base::V_vTarMsgW: INF - EXIT STATUS 0: the requested operation was successfully completed
12:24:47.056: [7632.4696] <8> tar_backup_tfi::cleanupTemp: WRN - will not cleanup 'temp' directory, debug level(s) are too high
12:24:47.056: [7632.4696] <4> bpio::read_string: INF - read non-blocking message of length 1
12:24:47.056: [7632.4696] <4> tar_backup::readServerMessage: INF - keepalive message received
12:24:50.056: [7632.4696] <4> bpio::read_string: INF - read non-blocking message of length 1
12:24:50.056: [7632.4696] <4> tar_backup::readServerMessage: INF - keepalive message received

12:26:50.053: [7632.4696] <16> dtcp_read: TCP - failure: recv socket (420) (TCP 10054: Connection reset by peer)
12:26:50.053: [7632.4696] <2> tar_base::V_vTarMsgW: WRN - Abnormal backup status, Exchange Transaction Logs were not truncated
12:26:50.053: [7632.4696] <4> tar_backup::backup_done_state: INF - Backup finished at 31/08/2012 10:27:04
12:26:50.053: [7632.4696] <4> dos_backup::tfs_reset: INF - Snapshot deletion start
12:26:50.053: [7632.4696] <4> dos_backup::V_Exchg2kBackupEnd: INF - cleaning up

BPBRM logs on master server:

12:24:46.874 [7444.5992] <2> bpbrm main: from client mbx1-lon: INF - Total Compressed Size:1057791821 + 69GB
12:24:46.874 [7444.5992] <2> ConnectionCache::connectAndCache: Acquiring new connection for host harry, query type 78
12:24:46.952 [7444.5992] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:24:46.952 [7444.5992] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpdbm
12:24:46.983 [7444.5992] <2> logconnections: BPDBM CONNECT FROM 10.1.104.78.2721 TO 10.1.104.78.13724
12:24:47.671 [7444.5992] <2> db_end: Need to collect reply
12:24:48.811 [7444.5992] <2> bpbrm wait_for_child: start
12:25:00.170 [7444.5992] <2> bpbrm wait_for_child: child exit_status = 0
12:25:00.170 [7444.5992] <2> bpbrm wait_for_child: child exit normal
12:25:00.170 [7444.5992] <2> bpbrm main: validating image for client mbx1-lon
12:25:00.170 [7444.5992] <2> ConnectionCache::connectAndCache: Acquiring new connection for host harry, query type 74
12:25:00.280 [7444.5992] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:25:00.280 [7444.5992] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpdbm
12:25:00.280 [7444.5992] <2> logconnections: BPDBM CONNECT FROM 10.1.104.78.2732 TO 10.1.104.78.13724
12:25:00.514 [7444.5992] <2> db_end: Need to collect reply

12:30:09.897 [2576.10160] <2> bpbrm main: INITIATING (VERBOSE = 0): version NetBackup 7.0 0

Marianne · ‎08-31-2012

Is master server also the media server for this backup?

Please post entire bpbkar and bpbrm logs file as File attachments (rename them to process name, e.g. bpbrm.txt).

Handy NetBackup Links

Latro · ‎09-04-2012

Apologies for late reply.

I have changed time window to start back up job earlier so it starts when server is not busy with everyday tasks - and logs have been truncated today. I have observed earlier similar behaviour when logs occassionally were truncated so it needs more time to see if this is permanent or just one off.

I will monitor it and if the issue happens again I will post bkbkar/bpbrm logs files.

Latro · ‎09-07-2012

Logs have been truncated for 4 consecutive back up days which did not happen for few weeks now. At this point, this is resolution I was looking for.

V4 · ‎09-07-2012

Also sync up with MS team on this timely behaviour. Backup application would only send instruction/notification to Exchange of successful backup completion. Thereafter it's Exchange's responsibility to get logs truncated

Madhusudana_D · ‎09-08-2012

Just fyi,

We had same issue that for one of the Stores were not getting truncated after the successful incremental backup.

Logs were not truncated as Exchange team guys have deleted the logs manually for that store due to space issue.

The truncation of the transaction logs sets the context for the next backup. Manual deletion removes the context and netbackup fails to truncate during the consecutive incremental backup attempts till the next Full backup. So please do not delete the logs manualy, it has to be deleted only by the backup job

NBU_TSE · ‎09-26-2012

Hi Latro,

Is this problem now resolved, if so please mark as solved. I guess the solution was to move the backup times?

Latro · ‎09-26-2012

Hi,

Yes, extending time windows resolved the issue.

Thanks.

VOX

Exchange 2007 logs not truncated, NetBackup 7.0