
Error code 13 for DB2 backup

tarmizi
Level 5
Partner Accredited Certified

Hello all, 

I have a question. We have a DB2 backup that runs as an online backup. Previously, the backup took up to 9 hours to finish, so we added @set db2_sessions=OPEN 2 SESSIONS WITH 4 BUFFERS BUFFER 1024 to the script (as shown in the sample) to shorten the backup time. The result is confusing: the jobs ended with error code 13, but the parent job has status 0. It also shows that the backup finished in 2 hours, and we are able to list the DB2 backup image using the command "bplist -C <client> -S <master_server> -t 18 -R/".

So is it a successful backup or not? Is this some kind of bug?

Thank you

1 ACCEPTED SOLUTION

Marianne
Level 6
Partner    VIP    Accredited Certified
The jobs failed with a client read timeout. As this timeout is already set to 1 hour (3600 seconds), you need to check the logs to see why the client did not seem to send any data for 1 hour. The logs needed on the media server are bptm and bpbrm, at log level 3 or higher. I'm not too sure about the logs on the client, as I don't have access to the DB2 agent guide right now. You may want to download the manual (links in Handy NBU Links in my signature) and look in the Troubleshooting chapter for the log folders that need to be created on the DB2 client.
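
As a rough illustration of the media server part, here is a minimal Python sketch that creates the bptm and bpbrm legacy log folders if they are missing. The helper name ensure_log_dirs is hypothetical, and the base path /usr/openv/netbackup/logs is an assumption (the usual location on a UNIX media server), so the example runs against a scratch directory instead. It does not change the log level, which still has to be raised to 3 or higher separately.

```python
import os
import tempfile


def ensure_log_dirs(base, names=("bptm", "bpbrm")):
    """Create one log folder per process name under base and return the paths."""
    created = []
    for name in names:
        path = os.path.join(base, name)
        # exist_ok=True makes the call safe to repeat on an existing folder.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


# Demonstration against a scratch directory. On a real UNIX media server the
# base would be /usr/openv/netbackup/logs (assumed default install location).
demo_base = tempfile.mkdtemp()
demo_dirs = ensure_log_dirs(demo_base)
```

The folder names for the DB2 client are deliberately left out, since they need to be taken from the Troubleshooting chapter of the agent guide.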


REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified
You have a number of Default Application Backup child streams that completed with status 0. Those are the ones that show up in the bplist command. The first step in troubleshooting is to look at the Details tab of the failed jobs. Please copy all the text and post it here. This should give us an indication of which logs will be needed for further troubleshooting.

tarmizi
Level 5
Partner Accredited Certified

Hello Marianne,

As requested, here is the detailed status of the first failed job:

7/1/2015 9:48:56 PM - Info bpbrm(pid=26375) msr3a153 is the host to backup data from     
7/1/2015 9:48:56 PM - Info bpbrm(pid=26375) reading file list for client        
7/1/2015 9:48:57 PM - Info bpbrm(pid=26375) listening for client connection         
7/1/2015 9:48:58 PM - Info bpbrm(pid=26375) INF - Client read timeout = 3600      
7/1/2015 9:48:59 PM - Info bpbrm(pid=26375) accepted connection from client         
7/1/2015 9:48:59 PM - Info dbclient(pid=6568) Backup started           
7/1/2015 9:48:59 PM - Info bpbrm(pid=26375) bptm pid: 26404          
7/1/2015 9:48:59 PM - Info bptm(pid=26404) start            
7/1/2015 9:48:59 PM - Info bptm(pid=26404) using 524288 data buffer size        
7/1/2015 9:48:59 PM - Info bptm(pid=26404) using 128 data buffers         
7/1/2015 9:49:00 PM - Info bptm(pid=26404) start backup           
7/1/2015 9:49:03 PM - Info bptm(pid=26404) backup child process is pid 26413       
7/1/2015 9:49:04 PM - Info dbclient(pid=6568) dbclient(pid=6568) wrote first buffer(size=4096)         
7/1/2015 10:00:49 PM - Info nbjm(pid=5172) starting backup job (jobid=1004916) for client msr3a153, policy IFAS_DB_ONLINE, schedule Default-Application-Backup  
7/1/2015 10:00:49 PM - Info nbjm(pid=5172) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1004916, request id:{A6DB828A-F53E-496A-B4FC-9FBCCE3304A4})  
7/1/2015 10:00:49 PM - requesting resource stu_disk_mssba572
7/1/2015 10:00:49 PM - requesting resource mssba565.NBU_CLIENT.MAXJOBS.msr3a153
7/1/2015 10:00:49 PM - requesting resource mssba565.NBU_POLICY.MAXJOBS.IFAS_DB_ONLINE
7/1/2015 10:00:49 PM - granted resource mssba565.NBU_CLIENT.MAXJOBS.msr3a153
7/1/2015 10:00:49 PM - granted resource mssba565.NBU_POLICY.MAXJOBS.IFAS_DB_ONLINE
7/1/2015 10:00:49 PM - granted resource MediaID=@aaaak;DiskVolume=PureDiskVolume;DiskPool=dp_disk_mssba572;Path=PureDiskVolume;StorageServer=mssba572;MediaServer=mssba572
7/1/2015 10:00:49 PM - granted resource stu_disk_mssba572
7/1/2015 10:00:50 PM - estimated 0 Kbytes needed
7/1/2015 10:00:50 PM - Info nbjm(pid=5172) started backup (backupid=msr3a153_1435759249) job for client msr3a153, policy IFAS_DB_ONLINE, schedule Default-Application-Backup on storage unit stu_disk_mssba572
7/1/2015 10:00:51 PM - started process bpbrm (26375)
7/1/2015 10:00:51 PM - connecting
7/1/2015 10:00:54 PM - connected; connect time: 0:00:03
7/1/2015 10:00:58 PM - begin writing
7/2/2015 4:47:22 AM - Error bpbrm(pid=26375) socket read failed: errno = 62 - Timer expired    
7/2/2015 4:47:24 AM - Error bptm(pid=26404) media manager terminated by parent process       
7/2/2015 4:47:44 AM - Info mssba572(pid=26404) StorageServer=PureDisk:mssba572; Report=PDDO Stats for (mssba572): scanned: 206655495 KB, CR sent: 10750819 KB, CR sent over FC: 0 KB, dedup: 94.8%, cache hits: 0 (0.0%)
7/2/2015 4:47:44 AM - Info dbclient(pid=6568) done. status: 13: file read failed       
7/2/2015 4:59:43 AM - end writing; write time: 6:58:45
file read failed(13)

And this is the detailed status of the second failed job:

7/1/2015 9:49:12 PM - Info bpbrm(pid=26440) msr3a153 is the host to backup data from     
7/1/2015 9:49:12 PM - Info bpbrm(pid=26440) reading file list for client        
7/1/2015 9:49:13 PM - Info bpbrm(pid=26440) listening for client connection         
7/1/2015 9:49:14 PM - Info bpbrm(pid=26440) INF - Client read timeout = 3600      
7/1/2015 9:49:14 PM - Info bpbrm(pid=26440) accepted connection from client         
7/1/2015 9:49:14 PM - Info dbclient(pid=6568) Backup started           
7/1/2015 9:49:14 PM - Info bpbrm(pid=26440) bptm pid: 26466          
7/1/2015 9:49:15 PM - Info bptm(pid=26466) start            
7/1/2015 9:49:15 PM - Info bptm(pid=26466) using 524288 data buffer size        
7/1/2015 9:49:15 PM - Info bptm(pid=26466) using 128 data buffers         
7/1/2015 9:49:16 PM - Info bptm(pid=26466) start backup           
7/1/2015 9:49:17 PM - Info bptm(pid=26466) backup child process is pid 26514       
7/1/2015 9:49:18 PM - Info dbclient(pid=6568) dbclient(pid=6568) wrote first buffer(size=4096)         
7/1/2015 10:01:05 PM - Info nbjm(pid=5172) starting backup job (jobid=1004919) for client msr3a153, policy IFAS_DB_ONLINE, schedule Default-Application-Backup  
7/1/2015 10:01:05 PM - Info nbjm(pid=5172) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1004919, request id:{85E7249F-A943-4F04-912F-4794C6A9BF4E})  
7/1/2015 10:01:05 PM - requesting resource stu_disk_mssba572
7/1/2015 10:01:05 PM - requesting resource mssba565.NBU_CLIENT.MAXJOBS.msr3a153
7/1/2015 10:01:05 PM - requesting resource mssba565.NBU_POLICY.MAXJOBS.IFAS_DB_ONLINE
7/1/2015 10:01:05 PM - granted resource mssba565.NBU_CLIENT.MAXJOBS.msr3a153
7/1/2015 10:01:05 PM - granted resource mssba565.NBU_POLICY.MAXJOBS.IFAS_DB_ONLINE
7/1/2015 10:01:05 PM - granted resource MediaID=@aaaak;DiskVolume=PureDiskVolume;DiskPool=dp_disk_mssba572;Path=PureDiskVolume;StorageServer=mssba572;MediaServer=mssba572
7/1/2015 10:01:05 PM - granted resource stu_disk_mssba572
7/1/2015 10:01:06 PM - estimated 0 Kbytes needed
7/1/2015 10:01:06 PM - Info nbjm(pid=5172) started backup (backupid=msr3a153_1435759265) job for client msr3a153, policy IFAS_DB_ONLINE, schedule Default-Application-Backup on storage unit stu_disk_mssba572
7/1/2015 10:01:07 PM - started process bpbrm (26440)
7/1/2015 10:01:07 PM - connecting
7/1/2015 10:01:10 PM - connected; connect time: 0:00:03
7/1/2015 10:01:13 PM - begin writing
7/2/2015 4:47:21 AM - Error bpbrm(pid=26440) socket read failed: errno = 62 - Timer expired    
7/2/2015 4:47:24 AM - Error bptm(pid=26466) media manager terminated by parent process       
7/2/2015 4:48:01 AM - Info mssba572(pid=26466) StorageServer=PureDisk:mssba572; Report=PDDO Stats (multi-threaded stream used) for (mssba572): scanned: 232226823 KB, CR sent: 11598545 KB, CR sent over FC: 0 KB, dedup: 95.0%, cache hits: 0 (0.0%)
7/2/2015 4:48:01 AM - Info dbclient(pid=6568) done. status: 13: file read failed       
7/2/2015 5:00:00 AM - end writing; write time: 6:58:47
file read failed(13)
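
To relate these timestamps to the 3600 second client read timeout, the following Python sketch measures how long the first job was writing before the socket read failed. The function names are hypothetical, the timestamp format is inferred from the log lines above, and the step of subtracting one timeout interval assumes that bpbrm reports the failure exactly one interval after the last data arrived, which is an assumption and not something the log states.

```python
from datetime import datetime

# Timestamp layout as it appears in the job details, e.g. "7/1/2015 10:00:58 PM".
FMT = "%m/%d/%Y %I:%M:%S %p"


def parse_line(line):
    """Split a job-detail line into (timestamp, message)."""
    stamp, _, message = line.partition(" - ")
    return datetime.strptime(stamp.strip(), FMT), message.strip()


def seconds_until_failure(lines):
    """Seconds between 'begin writing' and the first 'socket read failed' line."""
    start = end = None
    for line in lines:
        when, message = parse_line(line)
        if message == "begin writing":
            start = when
        elif "socket read failed" in message and end is None:
            end = when
    return (end - start).total_seconds()


CLIENT_READ_TIMEOUT = 3600  # seconds, from "Client read timeout = 3600"

# Two lines copied from the first failed job above.
job_1 = [
    "7/1/2015 10:00:58 PM - begin writing",
    "7/2/2015 4:47:22 AM - Error bpbrm(pid=26375) socket read failed: errno = 62 - Timer expired",
]

elapsed = seconds_until_failure(job_1)
# Assumption: the last data arrived one timeout interval before the error.
last_data_offset = elapsed - CLIENT_READ_TIMEOUT
```

For the first job this gives 24384 seconds (about 6 hours 46 minutes) of writing before the error. Under the stated assumption, the client stopped sending data roughly 5 hours 46 minutes into the job, which is the time window worth checking in the logs.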

tarmizi
Level 5
Partner Accredited Certified

Some additional info: it looks like the job created two backup images:

1) /DB2/PR0/node0000/20150701215755/PR0.0.DB2PR0.node0000.2571.20150701215755.1

2) /DB2/PR0/node0000/20150701215755/PR0.0.DB2PR0.node0000.2571.20150701215755.2

Is this because we added @set db2_sessions=OPEN 2 SESSIONS WITH 4 BUFFERS BUFFER 1024 to the script?
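
The two names above differ only in the trailing number. A minimal Python sketch that groups image paths by their backup timestamp makes this easier to see. The function names are hypothetical, and reading the last dot-separated field as a per-session piece number is an assumption based only on these two names, not on the agent guide.

```python
from collections import defaultdict


def split_image_path(path):
    """Return (backup timestamp, trailing piece number) from an image path."""
    name = path.rsplit("/", 1)[-1]
    fields = name.split(".")
    # Assumption: second-to-last field is the timestamp, last is the piece number.
    return fields[-2], int(fields[-1])


def pieces_by_backup(paths):
    """Map each backup timestamp to the sorted list of its piece numbers."""
    groups = defaultdict(list)
    for path in paths:
        stamp, piece = split_image_path(path)
        groups[stamp].append(piece)
    return {stamp: sorted(pieces) for stamp, pieces in groups.items()}


# The two image names listed above.
images = [
    "/DB2/PR0/node0000/20150701215755/PR0.0.DB2PR0.node0000.2571.20150701215755.1",
    "/DB2/PR0/node0000/20150701215755/PR0.0.DB2PR0.node0000.2571.20150701215755.2",
]

result = pieces_by_backup(images)
```

Both paths land under the same timestamp with pieces 1 and 2, which would be consistent with a single backup written through the two sessions requested in the script.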
