PDDO Duplications using an SLP failing with 'Error 84: media write error'
Hi,
Our PDDO duplications have stopped working since all of the related servers were rebooted over the weekend, and they are now reporting error 84.
Setup is:
VCB Snapshot backups taken from VMWare ESX 3.5 to staging area on NBU Media Server (6.5.5)
Snapshots backed up to primary Puredisk server
Backup duplicated both to Tape and secondary Puredisk server using a Storage Lifecycle Policy
The snapshots, backup and duplication to tape all work fine, but the duplication to PureDisk fails throughout the night. Sometimes the jobs re-run and pass, but there are always a few still trying to duplicate in the morning which will never complete. Job log below:
03/11/2010 10:17:08 - begin Duplicate
03/11/2010 10:17:13 - Info Duplicate (pid=7260) Initiating optimized duplication from @aaaak to @aaaam
03/11/2010 10:17:07 - requesting resource LCM_ST_DR_PureDisk_Pool
03/11/2010 10:17:07 - granted resource LCM_ST_DR_PureDisk_Pool
03/11/2010 10:17:08 - started process RUNCMD (pid=7260)
03/11/2010 10:17:08 - ended process 0 (pid=7260)
03/11/2010 10:17:08 - requesting resource ST_DR_PureDisk_Pool
03/11/2010 10:17:08 - granted resource MediaID=@aaaam;DiskVolume=PureDiskVolume;DiskPool=DR_PureDisk_Pool;Path=PureDiskVolume;StorageServer=espdisk02s;MediaServer=esnbupd01s
03/11/2010 10:17:08 - granted resource ST_DR_PureDisk_Pool
03/11/2010 10:17:11 - requesting resource @aaaak
03/11/2010 10:17:11 - granted resource MediaID=@aaaak;DiskVolume=PureDiskVolume;DiskPool=CH_PureDisk_Pool;Path=PureDiskVolume;StorageServer=espdisk01s;MediaServer=esnbupd01s
03/11/2010 10:17:14 - started process bpdm (pid=5588)
03/11/2010 10:17:29 - begin writing
03/11/2010 10:28:31 - Critical bpdm (pid=5588) sts_copy_extent failed: error 2060013 no more entries
03/11/2010 10:28:32 - Critical bpdm (pid=5588) image copy failed: error 2060013: no more entries
03/11/2010 10:28:32 - Error bpdm (pid=5588) cannot copy image from disk, bytesCopied = 18446744073709551615
03/11/2010 10:28:33 - Critical bpdm (pid=5588) sts_close_handle failed: 2060022 software error
03/11/2010 10:28:50 - Error bpduplicate (pid=7260) host esnbupd01s backup id essccm02v_1268258206 optimized duplication failed, media write error (84).
03/11/2010 10:28:50 - Error bpduplicate (pid=7260) Duplicate of backupid essccm02v_1268258206 failed, media write error (84).
03/11/2010 10:28:50 - Error bpduplicate (pid=7260) Status = no images were successfully processed.
03/11/2010 10:28:50 - end Duplicate; elapsed time 0:11:42
03/11/2010 10:28:51 - Info esnbupd01s (pid=5588) StorageServer=PureDisk:espdisk01s; Report=PDDO Stats for (espdisk01s): scanned: 6 KB, stream rate: 0.00 MB/sec, CR sent: 2097151 KB, dedup: 0.0%, cache hits: 0 (0.0%)
media write error (84)
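For anyone reading a job log like the one above, a small filter can cut it down to the lines that matter. This is only a sketch: it assumes the job details have been saved to a text file (job.log is a made-up name) in the same "date time - Severity ..." layout shown above, and job_failures is a helper name invented for the example, not a NetBackup command.

```shell
#!/bin/sh
# Print only the Critical and Error lines from a saved job log.
# Assumes each line starts with "MM/DD/YYYY HH:MM:SS - ", as in the log above.
job_failures() {
    grep -E '^[0-9/]+ [0-9:]+ - (Critical|Error) ' "$1"
}

# Example call (job.log is a hypothetical file name):
# job_failures job.log
```

Run against the log above, this should leave the bpdm and bpduplicate failure lines and drop the resource requests and process start/stop messages.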
Any help/ideas would be awesome. I've been trawling through the bpdm log and done a lot of Googling but found nothing.
Cheers
Rob
My problem was caused by deleting and then recreating a storage pool on 2 of our deduplication media servers.
Apparently the legacy database entries for forwarding the image copies were pointing to the wrong volumes.
With the help of a support tech, we deleted the erroneous indexes.
Call Tech Support for a solution, as I am sure the following could be dangerous.
This is not complete, but the indexes are located here:
/usr/openv/pdde/pdcr/bin/spadb -d /dedupe/databases -c "select id,name from dataselection"
id name
1 System DS for STP
2 PDDO
Corresponds to the following:
ls /dedupe/databases/spa/database/dataselection/
1 2
/usr/openv/pdde/pdcr/bin/spadb -d /dedupe/databases -c "select * from forward"
Corresponds to the following:
ls /dedupe/databases/spa/database/forward/
Essentially, you can remove any additional files from
/dedupe/databases/spa/database/dataselection/ and
/dedupe/databases/spa/database/forward/
EXCEPT for these entries!!!
id name
1 System DS for STP
2 PDDO
which are the files 1 & 2.
This has to be done on all storage servers
Once the services are started again, the indexes are re-created.
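Before removing anything, it may help to list what would be affected. The following is a dry-run sketch only: it deletes nothing and just prints the entries in each directory other than 1 and 2. The paths are the ones from this post and will differ with your storage path, list_extra_entries is a helper name made up for the example, and keeping 1 and 2 in the forward directory as well is my reading of the description above, so confirm it with support first.

```shell
#!/bin/sh
# Dry run: print entries in a directory whose names are NOT in the keep list.
# Nothing is removed; the output is only a list to review with support.
list_extra_entries() {
    dir=$1
    shift
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # empty or missing directory
        name=${path##*/}
        keep=no
        for k in "$@"; do
            [ "$name" = "$k" ] && keep=yes
        done
        [ "$keep" = no ] && printf '%s\n' "$path"
    done
    return 0
}

# Assumption: database location as quoted in this post.
DB=${DB:-/dedupe/databases/spa/database}
list_extra_entries "$DB/dataselection" 1 2
list_extra_entries "$DB/forward" 1 2
```

If the list is empty on a storage server, there is nothing extra there and this is probably not your problem.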