vdRead failed with error [16000]

ianhoskins
Level 4

Just started getting some errors in VMware backups after the VMs were Storage vMotioned to a newly provisioned LUN.
The datastore LUN is presented to my Media Agents, and I have verified that I can see the LUN on each of them.
Not all VMs on the LUN are having backup issues.

NetBackup 7.6.1.2 on RHEL

I get an error 6 in the Activity Log after the snapshot has been created.  The cleanup of the snapshot is successful.
 


10/23/2015 03:55:06 - Info nbjm (pid=22342) starting backup job (jobid=917452) for client NTBIUT01, policy test_blc, schedule Weekly_Full_Wed_1800
10/23/2015 03:55:06 - estimated 0 kbytes needed
10/23/2015 03:55:06 - Info nbjm (pid=22342) started backup (backupid=NTBIUT01_1445586906) job for client NTBIUT01, policy test_blc, schedule Weekly_Full_Wed_1800 on storage unit blc_copy1 using backup host lxnbumablcp01.conseco.bak
10/23/2015 03:55:07 - started process bpbrm (pid=26518)
10/23/2015 03:55:07 - connecting
10/23/2015 03:55:07 - connected; connect time: 0:00:00
10/23/2015 03:55:10 - begin writing
10/23/2015 03:55:49 - end writing; write time: 0:00:39
10/23/2015 03:57:14 - Info bpbrm (pid=26518) NTBIUT01 is the host to backup data from
10/23/2015 03:57:14 - Info bpbrm (pid=26518) reading file list for client
10/23/2015 03:57:14 - Info bpbrm (pid=26518) starting bpbkar on client
10/23/2015 03:57:14 - Info bpbkar (pid=26521) Backup started
10/23/2015 03:57:14 - Info bpbrm (pid=26518) bptm pid: 26522
10/23/2015 03:57:14 - Info bptm (pid=26522) start
10/23/2015 03:57:15 - Info bptm (pid=26522) using 262144 data buffer size
10/23/2015 03:57:15 - Info bptm (pid=26522) using 30 data buffers
10/23/2015 03:57:16 - Info bptm (pid=26522) start backup
10/23/2015 03:57:49 - Critical bpbrm (pid=26518) from client NTBIUT01: FTL - cleanup() failed, status 6

10/23/2015 03:57:51 - Error bptm (pid=26522) media manager terminated by parent process
10/23/2015 03:57:56 - Info bpbkar (pid=0) done. status: 6: the backup failed to back up the requested files
the backup failed to back up the requested files  (6)

We did note this in the VxMS logs:

EMCPOWER emcpower_parse_devname        : /nbcm_builds/NB/7.6.1.2/src/vxms/plugin/platforms/linux/osdep_emcpower.c.70 <INFO> : emcpower_parse_devname: is returning with ERROR
   10/22/2015 23:55:50 : vdRead:VixInterface.cpp:693 <ERROR> : Error 17592452332404352 in read with disk handle 46659552 startSector 0 numSectors 1
   10/22/2015 23:55:51 : readFlatSector:VixGuest.cpp:2500 <ERROR> : vdRead failed with error [16000] for [[vmd-pi-526] NTBIUT01/NTBIUT01.vmdk]. Issuing retry 1 of 3
   10/22/2015 23:55:55 : vdRead:VixInterface.cpp:693 <ERROR> : Error 17592452332404352 in read with disk handle 47317568 startSector 0 numSectors 1
   10/22/2015 23:55:56 : readFlatSector:VixGuest.cpp:2500 <ERROR> : vdRead failed with error [16000] for [[vmd-pi-526] NTBIUT01/NTBIUT01.vmdk]. Issuing retry 2 of 3
   10/22/2015 23:56:00 : vdRead:VixInterface.cpp:693 <ERROR> : Error 17592452332404352 in read with disk handle 49823232 startSector 0 numSectors 1
   10/22/2015 23:56:01 : readFlatSector:VixGuest.cpp:2500 <ERROR> : vdRead failed with error [16000] for [[vmd-pi-526] NTBIUT01/NTBIUT01.vmdk]. Issuing retry 3 of 3
   10/22/2015 23:56:05 : vdRead:VixInterface.cpp:693 <ERROR> : Error 17592452332404352 in read with disk handle 47381456 startSector 0 numSectors 1
   10/22/2015 23:56:05 : readFlatSector:VixGuest.cpp:2506 <ERROR> : Exiting with error in reading sector 0x0000000000000000 with sector run 0x0000000000000001 from disk [vmd-pi-526] NTBIUT01/NTBIUT01.vmdk
   10/22/2015 23:56:05 : resolvePartitions:VixGuest.cpp:2173 <ERROR> : Exited with VIX_PLG_EVDISKREAD
   10/22/2015 23:56:05 : resolvePartitions:VixGuest.cpp:2174 <ERROR> : Returning: 1026
   10/22/2015 23:56:05 : vixMapObjCtl:VixCoordinator.cpp:976 <ERROR> : Returning: 1026
   10/22/2015 23:56:05 : vix_map_objctl:libvix.cpp:1206
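The two numbers in these lines look like the same error in two forms. Below is a minimal Python sketch, assuming the VIX convention that the low 16 bits of the 64-bit value carry the error code (my understanding is that the `VIX_ERROR_CODE` macro in VMware's `vix.h` masks with `0xFFFF`; treat that as an assumption and check it against your VDDK headers). The function names are only illustrative.

```python
# Values copied from the VxMS log lines above.
RAW_ERROR = 17592452332404352   # "Error ... in read with disk handle ..."
REPORTED_CODE = 16000           # "vdRead failed with error [16000]"


def vix_error_code(raw):
    """Return the low 16 bits of a 64-bit VIX error value.

    Assumption: mirrors VIX_ERROR_CODE(err), i.e. (err & 0xFFFF).
    """
    return raw & 0xFFFF


def split_words(raw):
    """Split the raw value into (high 32 bits, low 32 bits) for inspection."""
    return raw >> 32, raw & 0xFFFFFFFF


if __name__ == "__main__":
    high, low = split_words(RAW_ERROR)
    print(f"raw  = {RAW_ERROR:#018x}")
    print(f"high = {high:#x}, low = {low:#x}")
    print(f"code = {vix_error_code(RAW_ERROR)}")
```

The masked value comes out as 16000, matching the bracketed code on the `readFlatSector` lines. As far as I know, the VDDK headers list 16000 as `VIX_E_DISK_INVAL` (an invalid parameter), but verify that against the VDDK version your NetBackup release ships with. I make no claim about what the upper bits mean.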

Ran across this article https://www.veritas.com/support/en_US/article.TECH167193 but it deals more with an error code 13, not a 6.

 

Thoughts?

2 REPLIES

ianhoskins
Level 4

OK, so after seeing the error "emcpower_parse_devname: is returning with ERROR" I decided to unmap the LUN from the Media Agents and rescan the SCSI bus.  I then mapped the LUN back to the Media Agents with a new LUN ID and did a rescan.  This changed the LUN to be at /dev/sg140 instead of /dev/sg69.

I restarted the backup job and it worked.

So, I unmapped the LUN again, rescanned, and mapped it back to the original LUN ID.  I rescanned the hosts and it came in on /dev/sg69.  I re-ran a backup and it failed.
 

As of right now, the LUN is on the NEW LUN ID and /dev/sg140 and all is working fine.
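To compare how each host sees the LUN after a remap and rescan, something like the following could be run on both Media Agents. It is a minimal sketch, assuming the usual Linux sysfs layout in which each `/sys/class/scsi_generic/sgN/device` symlink points at a directory named `host:channel:target:lun`. The function name and the `sysfs_root` parameter are hypothetical and exist only so the sketch can be pointed at a test directory.

```python
import os


def sg_to_hctl(sysfs_root="/sys/class/scsi_generic"):
    """Map sg device names to (host, channel, target, lun) tuples.

    Assumption: <sysfs_root>/<sgN>/device resolves to a directory whose
    name is the H:C:T:L address of the SCSI device.
    """
    mapping = {}
    if not os.path.isdir(sysfs_root):
        return mapping
    for name in sorted(os.listdir(sysfs_root)):
        link = os.path.join(sysfs_root, name, "device")
        target = os.path.basename(os.path.realpath(link))
        parts = target.split(":")
        # Skip anything that does not look like an H:C:T:L address.
        if len(parts) == 4 and all(p.isdigit() for p in parts):
            mapping[name] = tuple(int(p) for p in parts)
    return mapping


if __name__ == "__main__":
    for sg, hctl in sg_to_hctl().items():
        print(f"/dev/{sg} -> host={hctl[0]} channel={hctl[1]} "
              f"target={hctl[2]} lun={hctl[3]}")
```

Diffing the output from the "01" and "02" hosts before and after the remap would show whether /dev/sg69 and /dev/sg140 really correspond to the LUN IDs you expect on each box.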

I can't make sense of what the issue could have been.

Thoughts?

 

Ian

 

ianhoskins
Level 4

Just stumbled across some more information.

The half of the jobs that were successful were on the "02" media agent.  The other half were on the "01" media agent.

All the testing I described above was done on VMs that were backing up through the "01" media agent.

It definitely looks like there was some kind of SCSI issue on this box.
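A split like this is easier to spot if the job details are tallied per backup host. Here is a minimal sketch, assuming each job's detail text contains a "using backup host ..." line and a "done. status: N" line as in the job log earlier in this thread. The function name is hypothetical and the regular expressions are written against those two lines only.

```python
import re
from collections import Counter, defaultdict

# Patterns based on the job detail lines quoted earlier in the thread.
HOST_RE = re.compile(r"using backup host (\S+)")
STATUS_RE = re.compile(r"done\. status: (\d+)")


def tally_by_backup_host(job_logs):
    """Count final job statuses per backup host.

    job_logs: iterable of strings, one job's detail text per item.
    Returns {host: {status: count}}; jobs missing either line are skipped.
    """
    tally = defaultdict(Counter)
    for text in job_logs:
        host = HOST_RE.search(text)
        status = STATUS_RE.search(text)
        if host and status:
            tally[host.group(1)][int(status.group(1))] += 1
    return {h: dict(c) for h, c in tally.items()}


if __name__ == "__main__":
    # Two lines taken from the failed job shown above.
    sample = (
        "started backup job for client NTBIUT01, policy test_blc, "
        "on storage unit blc_copy1 using backup host lxnbumablcp01.conseco.bak\n"
        "Info bpbkar (pid=0) done. status: 6: the backup failed to back up "
        "the requested files\n"
    )
    print(tally_by_backup_host([sample]))
```

If one host shows only status 6 and the other only successes, that points at the host's view of the LUN and not at the VMs or the datastore.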