Image cleanup is cancelling the backup jobs

Severin
Level 2

Hi,

I am experiencing the following problem.
Sometimes the image cleanup job runs while other backup jobs are still running. As a result, the running backups are all cancelled:

(...)
23.06.2010 01:00:34 - begin writing
23.06.2010 03:02:18 - Critical bptm(pid=4052) image write failed: error 2060017: system call failed 
23.06.2010 03:02:22 - Error bptm(pid=4052) cannot write image to disk, Invalid argument  
23.06.2010 03:02:24 - end writing; write time: 02:01:50
media write error(84)

The image cleanup itself completes successfully:

23.06.2010 03:01:29 - Info bpdbm(pid=5480) image catalog cleanup      
23.06.2010 03:01:29 - Info bpdbm(pid=5480) deleting images which expire before Wed Jun 23 03:01:29 2010 (1277254889)
(...)
23.06.2010 03:01:31 - Info bpdbm(pid=5480) processing client netbackup-chbslsv02      
(...) 
23.06.2010 03:01:31 - Info bpdbm(pid=5480) deleted 48 expired records, compressed 0, tir removed 0, deleted 0 expired copies
(...)
23.06.2010 03:02:32 - Info nbdelete(pid=4840) deleting expired images. Media Server: netbackup-chbslsv01 Media: \\netbackup-chbsldd01\backup\chbslsv02\brbackup 
23.06.2010 03:02:32 - started process bpdm (5476)
23.06.2010 03:02:32 - Info bpdm(pid=5476) initial volume \\netbackup-chbsldd01\backup\chbslsv02\brbackup: Kbytes total capacity: 1171809792, used space: 1019736576, free space: 152073216
23.06.2010 03:02:32 - Info bpdm(pid=5476) ending volume \\netbackup-chbsldd01\backup\chbslsv02\brbackup: Kbytes total capacity: 1171809792, used space: 1019736576, free space: 152073216
(...)
the requested operation was successfully completed(0)

On the master server, the Image cleanup property is set to "After 20 hours".

Can anyone tell me what is going wrong? Why does the cleanup job run before the backup jobs have finished?

Thanks for your help

Séverin

1 ACCEPTED SOLUTION

Andy_Welburn
Level 6
The Image cleanup property is why the cleanup starts while backups are still running:

"The Image cleanup property specifies the maximum interval that can elapse before an image cleanup is run. Image cleanup is run after every successful backup session (that is, a session in which at least one backup runs successfully). If a backup session exceeds this maximum interval, an image cleanup is initiated."

So if your backup session has exceeded the 20-hour limit, an image cleanup will be started while the backups are running. This is normal and should in no way affect any active jobs.
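
To make the rule concrete, here is a minimal sketch of the trigger logic as the quoted documentation describes it. It is not NetBackup code: the function name `cleanup_due`, the example timestamps, and the assumption that the interval is measured from the previous cleanup are all hypothetical.

```python
from datetime import datetime, timedelta

def cleanup_due(last_cleanup, now, max_interval, session_just_succeeded):
    """Model of the quoted rule (hypothetical, not NetBackup's own code).

    A cleanup starts either right after a successful backup session, or once
    max_interval has elapsed since the previous cleanup, even if backups are
    still running at that moment.
    """
    if session_just_succeeded:
        return True
    return now - last_cleanup >= max_interval

# "After 20 hours" as set on the master server in the question.
max_interval = timedelta(hours=20)

# Hypothetical previous cleanup time, chosen only for illustration.
last_cleanup = datetime(2010, 6, 22, 7, 1, 29)

# Backups still running (no session has just finished), interval reached.
during_backup = datetime(2010, 6, 23, 3, 1, 29)
print(cleanup_due(last_cleanup, during_backup, max_interval, False))

# One hour earlier the interval has not yet elapsed.
earlier = during_backup - timedelta(hours=1)
print(cleanup_due(last_cleanup, earlier, max_interval, False))
```

Under this reading, a long backup session simply outlasts the interval, so the cleanup starts alongside the running jobs.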

You need to check your NetBackup and system logs for the reason behind your I/O errors (status 84). I would imagine it is a coincidence that the error happened at the same time as the image cleanup, unless you can prove otherwise. To test this, run the job again when there is no likelihood of a cleanup starting, then run it again and manually start a cleanup while it is running.
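
One way to look at the timing is to line up the timestamps from the job details posted in the question. The sketch below is only an illustration: the log lines are abbreviated copies of the ones in the question, and the helper `parse` is a hypothetical name.

```python
from datetime import datetime

# Abbreviated lines taken from the job details in the question.
LOG = """\
23.06.2010 01:00:34 - begin writing
23.06.2010 03:01:29 - Info bpdbm(pid=5480) image catalog cleanup
23.06.2010 03:02:18 - Critical bptm(pid=4052) image write failed: error 2060017: system call failed
23.06.2010 03:02:32 - Info nbdelete(pid=4840) deleting expired images.
"""

def parse(line):
    """Split a job-detail line into (timestamp, message)."""
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S"), message

events = [parse(line) for line in LOG.splitlines() if line.strip()]

cleanup_start = next(t for t, m in events if "image catalog cleanup" in m)
write_failure = next(t for t, m in events if "image write failed" in m)
nbdelete_start = next(t for t, m in events if "nbdelete" in m)

# Seconds between the start of the catalog cleanup and the write failure.
print((write_failure - cleanup_start).total_seconds())  # 49.0

# Whether the write failed before nbdelete began deleting images on disk.
print(write_failure < nbdelete_start)  # True
```

In this excerpt the write failure is logged 49 seconds after the catalog cleanup starts and before nbdelete begins. That ordering alone does not prove or rule out a connection, which is why repeating the job with and without a cleanup is still worthwhile.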

3 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified

You did not mention your O/S or NBU version.
This TechNote describes a situation with duplication jobs, but the symptoms and errors are similar. The problem is possibly not with image cleanup (I have never seen this at any of our customers, at large or small sites) but with the Windows memory cache manager.

http://seer.entsupport.symantec.com/docs/295563.htm

"error 2060017: system call failed" errors


Cause:
• More files are open than the memory cache manager can handle. As a result, the cache manager has exhausted the available paged pool memory.

• The backup program has tried to back up a file whose size is larger than the backup API can access on that version of the operating system. This has the same result (that is, the paged pool is exhausted).


Solution:
On the Windows 2003 server in question, two registry keys (PoolUsageMaximum, PagedPoolSize) may be created to fine-tune how the operating system manages paged pool memory. For complete details on creating and configuring these registry keys, review the Microsoft Knowledge Base article below:
 http://support.microsoft.com/kb/304101/en-us
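
As a rough illustration of what such a change looks like, the sketch below renders a `.reg` fragment for the two entries named above. It only builds text and changes nothing on the system. The helper `render_reg` is a hypothetical name, and the numbers in `example` are placeholder assumptions: take the actual values from the Microsoft article and test them before applying anything to a production media server.

```python
# Location of the Windows memory-management settings in the registry.
KEY = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
       r"\Session Manager\Memory Management")

def render_reg(values):
    """Render DWORD values as the text of a .reg file (nothing is written)."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY}]"]
    for name, value in values.items():
        # .reg files store DWORDs as eight lowercase hex digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"

# Placeholder values, assumed for illustration only; verify against the KB.
example = {"PoolUsageMaximum": 60, "PagedPoolSize": 0xFFFFFFFF}

print(render_reg(example))
```

Reviewing the rendered text first makes it easier to confirm the key path and values before importing anything. Changes to paged pool settings normally need a reboot to take effect.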

Andy_Welburn
Level 6
Did you manage to resolve this?