
elanmbx
10 years ago

Status 42 on VMware backups (somewhat inconsistent)

Solaris Master - 7.6.0.3

3 x 5220 appliance media servers - 2.6.0.3

Appliances are the VMware Backup Hosts

All VMware backups are done via SAN transport mode

Job details from a "representative" Status 42 failure:

11/13/2014 12:06:09 - Info nbjm (pid=19785) starting backup job (jobid=8025958) for client denjsit15, policy RETRY-VMware-DEV, schedule Full
11/13/2014 12:06:09 - estimated 0 kbytes needed
11/13/2014 12:06:09 - Info nbjm (pid=19785) started backup (backupid=denjsit15_1415905569) job for client denjsit15, policy RETRY-VMware-DEV, schedule Full on storage unit stu_disk_densyma02p
11/13/2014 12:06:10 - started process bpbrm (pid=12111)
11/13/2014 12:06:11 - Info bpbrm (pid=12111) denjsit15 is the host to backup data from
11/13/2014 12:06:11 - Info bpbrm (pid=12111) reading file list for client
11/13/2014 12:06:11 - Info bpbrm (pid=12111) starting bpbkar on client
11/13/2014 12:06:11 - Info bpbkar (pid=12146) Backup started
11/13/2014 12:06:11 - connecting
11/13/2014 12:06:11 - connected; connect time: 0:00:00
11/13/2014 12:06:12 - Info bpbrm (pid=12111) bptm pid: 12147
11/13/2014 12:06:12 - Info bptm (pid=12147) start
11/13/2014 12:06:15 - Info bptm (pid=12147) using 524288 data buffer size
11/13/2014 12:06:15 - Info bptm (pid=12147) setting receive network buffer to 262144 bytes
11/13/2014 12:06:15 - Info bptm (pid=12147) using 128 data buffers
11/13/2014 12:06:17 - Info bptm (pid=12147) start backup
11/13/2014 12:06:26 - begin writing
11/13/2014 12:07:24 - Info bpbkar (pid=12146) 0 entries sent to bpdbm
11/13/2014 12:07:24 - Info bpbkar (pid=12146) 90 entries sent to bpdbm
11/13/2014 12:07:24 - Info bpbkar (pid=12146) 91 entries sent to bpdbm
11/13/2014 12:07:31 - Info bpbkar (pid=12146) 95091 entries sent to bpdbm
11/13/2014 12:07:40 - Info bpbkar (pid=12146) 190091 entries sent to bpdbm
11/13/2014 12:07:47 - Info bpbkar (pid=12146) 272387 entries sent to bpdbm
11/13/2014 12:07:47 - Info bpbkar (pid=12146) 272388 entries sent to bpdbm
11/13/2014 12:07:50 - Info bpbkar (pid=12146) 287172 entries sent to bpdbm
11/13/2014 12:07:50 - Info bpbkar (pid=12146) 287173 entries sent to bpdbm
11/13/2014 12:07:50 - Info bpbkar (pid=12146) 287232 entries sent to bpdbm
11/13/2014 12:07:50 - Info bpbkar (pid=12146) 287233 entries sent to bpdbm
11/13/2014 12:07:58 - Info bpbkar (pid=12146) 382233 entries sent to bpdbm
11/13/2014 12:08:06 - Info bpbkar (pid=12146) 477233 entries sent to bpdbm
11/13/2014 12:08:14 - Info bpbkar (pid=12146) 558340 entries sent to bpdbm
11/13/2014 12:08:14 - Info bpbkar (pid=12146) 558341 entries sent to bpdbm
11/13/2014 12:08:14 - Info bpbkar (pid=12146) 558348 entries sent to bpdbm
11/13/2014 12:08:14 - Info bpbkar (pid=12146) INF - Transport Type =  san
11/13/2014 12:08:14 - Info bpbkar (pid=12146) 558381 entries sent to bpdbm
11/13/2014 12:09:39 - Info bpbkar (pid=12146) 558382 entries sent to bpdbm
11/13/2014 12:13:20 - Info bpbkar (pid=12146) bpbkar waited 10177 times for empty buffer, delayed 29247 times
11/13/2014 12:13:20 - Info bptm (pid=12147) waited for full buffer 6491 times, delayed 17216 times
11/13/2014 12:22:35 - Error bptm (pid=12147) get_string() failed, Broken pipe (32), premature end of file encountered
11/13/2014 12:22:35 - Info bptm (pid=12147) EXITING with status 42 <----------
11/13/2014 12:22:38 - Info densyma02p (pid=12147) StorageServer=PureDisk:densyma02p; Report=PDDO Stats (multi-threaded stream used) for (densyma02p): scanned: 23198447 KB, CR sent: 31511 KB, CR sent over FC: 0 KB, dedup: 99.9%, cache hits: 260943 (99.5%)
11/13/2014 12:22:40 - Info bpbkar (pid=0) done
11/13/2014 12:22:40 - Info bpbkar (pid=0) done. status: 42: network read failed
11/13/2014 12:22:40 - end writing; write time: 0:16:14
network read failed  (42)
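
When reading job details like the ones above, it can help to see where the job went quiet before it failed. Here is a rough Python sketch (not a NetBackup tool; `parse_job_details` and `longest_gap` are made-up helper names, and the line format is assumed to match the excerpt above) that finds the longest pause between consecutive timestamped lines. It only locates the pause and says nothing about what caused it.

```python
import re
from datetime import datetime

# Assumed line shape, based on the job details above:
#   MM/DD/YYYY HH:MM:SS - message
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - (.*)$")


def parse_job_details(text):
    """Return a list of (timestamp, message) tuples for timestamped lines."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
            events.append((ts, match.group(2)))
    return events


def longest_gap(events):
    """Return (gap, message_before, message_after) for the largest pause."""
    best = None
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


# A few lines copied from the end of the job details above.
sample = """\
11/13/2014 12:08:14 - Info bpbkar (pid=12146) 558381 entries sent to bpdbm
11/13/2014 12:09:39 - Info bpbkar (pid=12146) 558382 entries sent to bpdbm
11/13/2014 12:13:20 - Info bpbkar (pid=12146) bpbkar waited 10177 times for empty buffer, delayed 29247 times
11/13/2014 12:13:20 - Info bptm (pid=12147) waited for full buffer 6491 times, delayed 17216 times
11/13/2014 12:22:35 - Error bptm (pid=12147) get_string() failed, Broken pipe (32), premature end of file encountered
11/13/2014 12:22:35 - Info bptm (pid=12147) EXITING with status 42
"""

events = parse_job_details(sample)
gap, before, after = longest_gap(events)
print("longest pause:", gap)
print("ended at:", after)
```

On these sample lines the longest pause is the 9 minutes 15 seconds between the buffer statistics at 12:13:20 and the broken pipe error at 12:22:35.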

I have attached the associated bpbrm, bpbkar, and bptm logs. I have included only the last 500 lines of the bptm log entries for pid=12147, since at VERBOSE=5 the bptm log was huge for this job.

Any help would be greatly appreciated!

  • I deleted all the old files and then retried the backup that had recently been failing, and it worked. I'm cautiously optimistic that my "Status 42" issues will be less prevalent after this discovery.

    Anyone know why these files get left behind?  The backup I retried left behind the following files:

    densyma02p:/usr/openv/netbackup/online_util/fi_cntl # ll
    total 21744
    -rw-r--r-- 1 root root    2473 Nov 20 14:58 bpfis.fim.<host>_1416520694.1.0
    -rw-r--r-- 1 root root    2058 Nov 20 14:58 bpfis.fim.<host>_1416520694.1.0.NBU_DATA.xml
    -rw-r--r-- 1 root root      21 Nov 20 15:05 bpfis.fim.<host>_1416520694.1.0.NBU_DATA.xml.BID
    -rw-r--r-- 1 root root   22406 Nov 20 14:58 bpfis.fim.<host>_1416520694.1.0.VM_ObjInfoXML.xml
    -rw-r--r-- 1 root root     734 Nov 20 14:58 bpfis.fim.<host>_1416520694.1.0.changeid.xml
    -rw-rw-rw- 1 root root     284 Nov 20 15:10 <host>_1416520694_copy1.lock

    I'm inclined to put a cron job on the appliance to clean up this directory daily, but I would like to avoid such a modification if I can. I really dislike doing these kinds of "one-off" fixes on the appliances.
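
For anyone weighing the same kind of daily cleanup, here is a minimal sketch of what it could look like in Python, defaulting to a dry run. Several things here are assumptions rather than documented NetBackup behavior: that Python is available on the host, that files in this directory older than one day do not belong to a running job, and that removing them is safe. The path comes from the listing above, and `find_stale` and `clean` are hypothetical helper names.

```python
import os
import time

# Path taken from the directory listing above; adjust for your host.
FI_CNTL = "/usr/openv/netbackup/online_util/fi_cntl"


def find_stale(directory, max_age_days=1.0, now=None):
    """Return sorted paths of regular files whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in os.scandir(directory):
        # Skip subdirectories and symlinks; only plain files are considered.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)


def clean(directory, max_age_days=1.0, dry_run=True):
    """Report stale files and delete them only when dry_run is False."""
    stale = find_stale(directory, max_age_days)
    for path in stale:
        print(("would remove " if dry_run else "removing ") + path)
        if not dry_run:
            os.remove(path)
    return stale


if __name__ == "__main__":
    # ASSUMPTION: anything older than a day is left over from a finished job.
    if os.path.isdir(FI_CNTL):
        clean(FI_CNTL, max_age_days=1.0, dry_run=True)
```

Running it with `dry_run=True` for a few days and comparing the output against the active jobs would be a cautious way to check the age assumption before letting anything delete files.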
