
Processing full disk condition - job showing active but not resuming backup

JohnOBrien
Level 3
Partner

Hi There,

I encountered something this morning in our NetBackup 7.1.0.3 environment.

Setup
Windows 2003

Master server
Media server with staging disk
Media server with staging disk and library w LTO4 *4

3 SU
SU1 EMC Celerra staging disk (advanced)    HW 98%  LW 70%
SU2 EMC Celerra staging disk (advanced)    HW 98%  LW 70%
SU3 library
1 Advanced Disk SUG with 1 & 2

The client read timeout is set to 900

 

Some jobs encountered a full disk condition last night, which is common; however, they still have not resumed 12 hours later.

So this morning I changed the HW to 70% and LW to 60% to force it to clear down, then set them back to HW 98% LW 90%. Usage got down to 75% on both.

The jobs still will not restart automatically and are showing as Active. I cannot suspend, resume, or restart them.

As I understand it, if a job hits a full disk it waits for the disk to reach the low watermark before retrying; it then runs again until it completes or hits the disk full condition again.
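The retry rule described above can be sketched in a few lines of Python. This only encodes the understanding stated in the post, not confirmed NetBackup behaviour; the function name `can_retry` is made up for illustration, and the percentages are the ones quoted in this thread.

```python
def can_retry(used_pct, low_watermark_pct):
    """Assumed rule: a job paused on disk full retries only once
    disk usage has dropped to the low watermark or below."""
    return used_pct <= low_watermark_pct


# Figures from the post: usage fell to 75% after the forced clear-down.
print(can_retry(75, 70))  # against an LW of 70%
print(can_retry(75, 90))  # against an LW of 90%
```

If this model holds, a job still sitting idle while usage is below the low watermark suggests something other than free space is holding it.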

The reason for the LW @ 70% is that we have large backups that have produced status 129 (disk full) errors in the past.

Can I do anything to force start these jobs?

 

Here is the bperror and bpdbjobs output for the job:

 

C:\Program Files\Veritas\NetBackup\bin>bpdbjobs -all_columns -jobid 390342
390342,0,1,,_Netbackup_Std_Srv_2003,Differential-Inc,CLIENT0044,MEDIA_1,1391126518,0000042763,0000000000,MEDIA_1_SU_1,1,3,2932480,175,/C/WINDOWS/system32/,68,3276,root,1,13,1,0,root,inbdub0006,2,1,0,0,314,5,SET SNAP_ID=CLIENT0044_1391126454,Shadow Copy Components:\\,c:\\,d:\\,,1,3276,MEDIA_1_SU_1,MEDIA_1,1391126518,0000042763,0000000000,,,21,1/31/2014 12:01:58 AM - Info nbjm(pid=4056) starting backup job (jobid=390342) for client CLIENT0044\, policy _Netbackup_Std_Srv_2003\, schedule Differential-Inc,1/31/2014 12:01:58 AM - estimated 4293765 kbytes needed,1/31/2014 12:01:58 AM - Info nbjm(pid=4056) started backup job for client CLIENT0044\, policy _Netbackup_Std_Srv_2003\, schedule Differential-Inc on storage unit MEDIA_1_SU_1,1/31/2014 12:02:04 AM - Info bpbrm(pid=3276) CLIENT0044 is the host to backup data from,1/31/2014 12:02:05 AM - Info bpbrm(pid=3276) reading file list from client,1/31/2014 12:02:02 AM - started process bpbrm (3276),1/31/2014 12:02:07 AM - connecting,1/31/2014 12:02:11 AM - Info bpbrm(pid=3276) starting bpbkar32 on client,1/31/2014 12:02:13 AM - Info bpbkar32(pid=0) Backup started,1/31/2014 12:02:14 AM - Info bptm(pid=2508) start,1/31/2014 12:02:14 AM - Info bptm(pid=2508) using 262144 data buffer size,1/31/2014 12:02:14 AM - Info bptm(pid=2508) setting receive network buffer to 1049600 bytes,1/31/2014 12:02:14 AM - Info bptm(pid=2508) using 30 data buffers,1/31/2014 12:02:16 AM - Info bptm(pid=2508) start backup,1/31/2014 12:02:19 AM - Info bptm(pid=2508) backup child process is pid 7704.3752,1/31/2014 12:02:19 AM - Info bptm(pid=7704) start,1/31/2014 12:02:11 AM - connected; connect time: 000:00:04,1/31/2014 12:02:19 AM - begin writing,1/31/2014 12:12:07 AM - Warning bpbrm(pid=3276) from client CLIENT0044: WRN - can't open file: C:\\WINDOWS\\system32\\wins\\winstmp.mdb (WIN32 32: The process cannot access the file because it is being used by another process. ),1/31/2014 12:12:11 AM - Info bpbrm(pid=3276) from client CENT0044: TRV - Skipping path on network file system: D:,1/31/2014 12:41:09 AM - Warning bptm(pid=2508) disk pool MEDIA_1_Disk_Pool_1 volume F:\\ is full: processing disk full condition,2932480,175,390296,1264,,,,,,,,,,,1,0,0,1,0,0,CLIENT0044_1391126518,1,,,0, ,
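The bpdbjobs record above is hard to read as one line. Judging from the sample, fields are comma-separated, and literal commas and backslashes inside a field appear to be escaped with a backslash (`\,` and `\\`). Assuming that holds, a small Python sketch can split the record into fields; `split_escaped` is a made-up helper, not part of NetBackup.

```python
def split_escaped(record, sep=","):
    """Split on sep, treating backslash as an escape for the next character."""
    fields, current, i = [], [], 0
    while i < len(record):
        ch = record[i]
        if ch == "\\" and i + 1 < len(record):
            # Escaped character: keep it literally and skip the backslash.
            current.append(record[i + 1])
            i += 2
        elif ch == sep:
            # Unescaped separator closes the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


# Fragment copied from the record above.
sample = r"SET SNAP_ID=CLIENT0044_1391126454,Shadow Copy Components:\\,c:\\,d:\\,,1"
for field in split_escaped(sample):
    print(repr(field))
```

A plain split on every comma would cut the try-log messages apart at `\,`, and a split that only skips commas preceded by a backslash would wrongly glue `c:\\,d:\\` together, which is why the sketch walks the string one character at a time.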
 
C:\Program Files\Veritas\NetBackup\bin>bperror -jobid 390342
1391126518 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 nbjm started backup job for client CLIENT0044, policy _Netbackup_Std_Srv_2003, schedule Differential-Inc on storage unit MEDIA_1_SU_1
1391126521 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 nbjm client CLIENT0044 handling path SET SNAP_ID=CLIENT0044_1391126454
1391126521 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 nbjm client CLIENT0044 handling path Shadow Copy Components:\
1391126521 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 nbjm client CLIENT0044 handling path c:\
1391126521 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 nbjm client CLIENT0044 handling path d:\
1391126539 1 33412 4 MEDIA_1 390342 390296 0 CLIENT0044 bptm begin writing backup id CLIENT0044_1391126518, copy 1, fragment 1, destination path F:\
1391127128 1 4 8 MEDIA_1 390342 390296 0 CLIENT0044 bpbrm from client CLIENT0044: WRN - can't open file: C:\WINDOWS\system32\wins\winstmp.mdb (WIN32 32: The process cannot access the file because it is being used by another process. )
1391127194 1 4 4 MEDIA_1 390342 390296 0 CLIENT0044 bpbrm from client CLIENT0044: TRV - Skipping path on network file system: D:
1391128870 1 33412 8 MEDIA_1 390342 390296 0 CLIENT0044 bptm disk pool MEDIA_1_Disk_Pool_1 volume F:\ is full: processing disk full condition
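To pick the disk full event out of bperror output like the above, each entry can be parsed in Python. This is a minimal sketch that assumes the layout seen in this sample: an epoch timestamp first, the media server in the fifth column, the job ID in the sixth, then the client and process in the ninth and tenth, followed by the message. The meaning of the other numeric columns is not assumed, and `parse_bperror_line` is a made-up name.

```python
from datetime import datetime, timezone


def parse_bperror_line(line):
    """Parse one bperror entry; return None if it does not fit the assumed layout."""
    parts = line.split(None, 10)
    if len(parts) < 11 or not parts[0].isdigit():
        return None
    return {
        "time": datetime.fromtimestamp(int(parts[0]), tz=timezone.utc),
        "server": parts[4],
        "job_id": int(parts[5]),
        "client": parts[8],
        "process": parts[9],
        "message": parts[10],
    }


# Last entry from the output above.
sample = ("1391128870 1 33412 8 MEDIA_1 390342 390296 0 CLIENT0044 bptm "
          "disk pool MEDIA_1_Disk_Pool_1 volume F:\\ is full: "
          "processing disk full condition")
rec = parse_bperror_line(sample)
if rec and "disk full" in rec["message"]:
    print(rec["time"].isoformat(), rec["process"], rec["message"])
```

The timestamp is converted as UTC here; adjust for the master server's time zone when comparing against Activity Monitor times.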

 

 

Any Help appreciated.

 

John

Accepted Solutions

SymTerry
Level 6
Employee Accredited

NetBackup doc HOWTO90550 talks about error 129. Space is going to be an issue here. I would look into making sure the staged images are removed when the jobs are complete. TECH44719 and TECH66149 talk about NetBackup's behaviour for relocation and for cleanup of staged images.

NetBackup will automatically start to clean up images that have been successfully duplicated to tape and then retry the backup. If it doesn't, something might be hung. Kill the job and try it again, making sure you have enough staging space.
