10-19-2015 11:25 PM
Hi everyone,
We've been experiencing failed VMware backups lately, with intermittent status codes 84, 83, 87 and 2074. We started by checking the MSDP status, but it's always up and never goes down. We've already tried several workarounds.
Information:
Below is the detailed status of one of the jobs.
10/20/2015 2:12:18 PM - Info bpbrm(pid=4260) 132_147_160_082_ZMUNCLMSTST_PCPPI_COM is the host to backup data from
10/20/2015 2:12:18 PM - Info bpbrm(pid=4260) reading file list for client
10/20/2015 2:12:18 PM - Info bpbrm(pid=4260) accelerator enabled
10/20/2015 2:12:19 PM - Info bpbrm(pid=4260) There is no complete backup image match with track journal, a regular full backup will be performed.
10/20/2015 2:12:19 PM - Info bpbrm(pid=4260) starting bpbkar32 on client
10/20/2015 2:12:20 PM - Info bpbkar32(pid=6508) Backup started
10/20/2015 2:12:21 PM - Info bpbkar32(pid=6508) accelerator enabled backup, archive bit processing:<disabled>
10/20/2015 2:12:23 PM - Info bptm(pid=4196) start
10/20/2015 2:12:25 PM - Info bptm(pid=4196) using 262144 data buffer size
10/20/2015 2:12:25 PM - Info bptm(pid=4196) setting receive network buffer to 1049600 bytes
10/20/2015 2:12:25 PM - Info bptm(pid=4196) using 30 data buffers
10/20/2015 2:12:31 PM - Info bptm(pid=4196) start backup
10/20/2015 2:12:43 PM - Info bptm(pid=4196) backup child process is pid 4620.4632
10/20/2015 2:12:43 PM - Info bptm(pid=4620) start
10/20/2015 2:13:02 PM - Info bpbkar32(pid=6508) 0 entries sent to bpdbm
10/20/2015 2:13:13 PM - Info bpbkar32(pid=6508) 95000 entries sent to bpdbm
10/20/2015 2:13:21 PM - Info bpbkar32(pid=6508) 146631 entries sent to bpdbm
10/20/2015 2:13:21 PM - Info bpbkar32(pid=6508) 146632 entries sent to bpdbm
10/20/2015 2:13:39 PM - Info nbjm(pid=4736) starting backup job (jobid=51853) for client 132_147_160_082_ZMUNCLMSTST_PCPPI_COM, policy zmunCLMStst_vmware_backup, schedule Monthly_Full_Backup
10/20/2015 2:13:39 PM - estimated 0 Kbytes needed
10/20/2015 2:13:39 PM - Info nbjm(pid=4736) started backup (backupid=132_147_160_082_ZMUNCLMSTST_PCPPI_COM_1445375619) job for client 132_147_160_082_ZMUNCLMSTST_PCPPI_COM, policy zmunCLMStst_vmware_backup, schedule Monthly_Full_Backup on storage unit znbumedia1_dedupe_disk_pool-stu using backup host znbumaster1
10/20/2015 2:13:40 PM - started process bpbrm (4260)
10/20/2015 2:13:42 PM - connecting
10/20/2015 2:13:43 PM - connected; connect time: 0:00:01
10/20/2015 2:13:47 PM - Info bpbkar32(pid=6508) 192623 entries sent to bpdbm
10/20/2015 2:13:47 PM - Info bpbkar32(pid=6508) INF - Transport Type = nbd
10/20/2015 2:14:07 PM - begin writing
10/20/2015 2:14:15 PM - Critical bptm(pid=4196) Storage Server Error: (Storage server: PureDisk:znbumedia1) pdvfs_io_pwrite: Failed to write to spoold on storage server (invalid argument). Ensure storage server services are running and operational. V-454-10
10/20/2015 2:14:15 PM - Critical bptm(pid=4196) image write failed: error 2060001: one or more invalid arguments
10/20/2015 2:14:48 PM - Critical bptm(pid=4196) Storage Server Error: (Storage server: PureDisk:znbumedia1) _pdvfs_fcache_flushfile_norm_cleanup: Failed to write to spoold on storage server (invalid stream descriptor). Ensure storage server services are running and operational. V-454-8
10/20/2015 2:14:49 PM - Critical bptm(pid=4196) sts_close_handle failed: 2060017 system call failed
10/20/2015 2:15:01 PM - Error bptm(pid=4196) cannot write image to disk, Invalid argument
10/20/2015 2:15:01 PM - Info bptm(pid=4196) EXITING with status 84 <----------
10/20/2015 2:15:01 PM - Info znbumedia1(pid=4196) StorageServer=PureDisk:znbumedia1; Report=PDDO Stats for (znbumedia1): scanned: 361864 KB, CR sent: 14430 KB, CR sent over FC: 0 KB, dedup: 96.0%, cache disabled
10/20/2015 2:15:05 PM - Info bpbkar32(pid=0) done. status: 84: media write error
10/20/2015 2:16:28 PM - end writing; write time: 0:02:21
media write error (84)
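For context, the pdvfs_io_pwrite and sts_close_handle failures above point at the MSDP storage server (spoold) on the media server rather than at the client. The dedupe services and content router can be spot-checked from the command line; the sketch below assumes typical Windows default install paths and service names, so verify both against the actual environment:

```shell
REM Sketch: quick MSDP health checks on the media server (znbumedia1).
REM Paths and service names are assumed Windows defaults; adjust for your install.

REM Content-router mode and data store statistics (requires spoold to be up):
"C:\Program Files\Veritas\pdde\crcontrol.exe" --getmode
"C:\Program Files\Veritas\pdde\crcontrol.exe" --dsstat

REM Confirm the dedupe services (spoold/spad) are running
REM (service names assumed; verify the exact names in services.msc):
sc query "NetBackup Deduplication Engine"
sc query "NetBackup Deduplication Manager"

REM Storage server state as the master server sees it:
"C:\Program Files\Veritas\NetBackup\bin\admincmd\nbdevquery.exe" -liststs -stype PureDisk -U
```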
Hoping you could help us with this.
Thanks,
Mariz
10-20-2015 11:02 AM
Please attach verbose bptm log
10-20-2015 09:05 PM
Hi Riaan,
I'm not sure I could attach it here, as the file is around 9GB. Is there another way to upload it, or is there any info from the log that you'd like me to find? Thanks!
10-20-2015 09:24 PM
Hi,
Just post some of it so we can see what happened around 2:10 - 2:15 PM.
10-21-2015 12:00 AM
Part 1 of bptm log
10-21-2015 12:00 AM
Part 2 of bptm log
10-21-2015 12:14 AM
Hi,
Your log has a 5-hour gap in it. Maybe something is going wrong on your media server?
02:12:15.995 [1836.4272] <2> 51839:bptm:1836:znbumedia1: [DEBUG] PDSTS: write_multi_buf: buf(0000000002270000) buflen(262144) bytes(0) map_imglength(94720) imh_cur_filesize(1708032) actual_size(1708032)
07:22:15.012 [4300.1456] <2> 51843:bptm:4300:znbumedia1: [DEBUG] PDSTS: pi_write_image_v7: entry
10-21-2015 12:24 AM
Check Event Viewer logs on the media server to see if anything was logged in this period.
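Instead of browsing the Event Viewer UI by hand, recent errors around the failure window can be pulled from the command line with `wevtutil` (standard on Windows Server); the XPath filter below selects Error-level events from the last 24 hours:

```shell
REM Sketch: pull recent events on the media server around the failure window.

REM Last 50 System events, newest first, as plain text:
wevtutil qe System /c:50 /rd:true /f:text

REM Error-level events (Level=2) from the last 24 hours (86400000 ms):
wevtutil qe System /rd:true /f:text /q:"*[System[(Level=2) and TimeCreated[timediff(@SystemTime) <= 86400000]]]"
wevtutil qe Application /rd:true /f:text /q:"*[System[(Level=2) and TimeCreated[timediff(@SystemTime) <= 86400000]]]"
```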
10-21-2015 12:46 AM
Hi Riaan,
We had a disk failure last month that was fixed, but once it was fixed, we started receiving status 84, 83, 636 and 2074 on backups. Last week we had the disk replaced, but we are still receiving these errors. I've already increased client read timeouts and added touch files; I'm not sure what's causing the delay, though...
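For reference, the bptm buffer values reported in the job detail above (262144-byte buffers, 30 buffers, 1049600-byte network buffer) are driven by touch files in NetBackup\db\config on the media server. A minimal sketch, assuming the default Windows install path; the "+1KB" note on NET_BUFFER_SZ is behavior I believe bptm applies when reporting the value, so treat it as an assumption:

```shell
REM Sketch: touch files that drive the bptm values seen in the job detail.
REM Path is the assumed Windows default; adjust for your installation.

set CFG=C:\Program Files\Veritas\NetBackup\db\config

REM "using 262144 data buffer size"
echo 262144> "%CFG%\SIZE_DATA_BUFFERS"

REM "using 30 data buffers"
echo 30> "%CFG%\NUMBER_DATA_BUFFERS"

REM "setting receive network buffer to 1049600 bytes"
REM (assumed: bptm reports the configured value plus 1KB of overhead)
echo 1048576> "%CFG%\NET_BUFFER_SZ"

REM bptm reads these files at job start, so new jobs pick them up without a restart.
```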
10-21-2015 12:48 AM
Hi Marianne,
We haven't received any errors in the Event Viewer since we had the disk replaced (Thursday last week). I am seeing DNS errors in the bptm logs, though. Please see attachment. Thanks.
10-21-2015 12:52 AM
Any chance you can post logs as .txt files?
.docx needs to be downloaded before it can be opened.
.txt files can be viewed online.
10-21-2015 01:01 AM
I think you've lost some data; there are a lot of errors such as this: "failed (2:No such file or directory)"
10-21-2015 01:12 AM
Hi Marianne,
Please see attached .txt file. Thanks.
10-21-2015 01:13 AM
Hi Riaan,
I'm not sure how to check, though. This has been looked into by several support engineers already, and none of them has gotten back to me with results yet. I'm not sure if they were able to capture the bptm logs before, as those were too big as well...
10-21-2015 01:21 AM
There are also network errors:
07:54:50.038 [4300.1456] <8> 51843:bptm:4300:znbumedia1: [WARN] PDSTS: impl_set_imh_image_prop: PdvfsUnlink(/znbumedia1#1/2/zmunmbx01.pcppi.com/zmunmbx01_vmware_backup/zmunmbx01.pcppi.com_1445350631_C1_F2.info) failed (2:No such file or directory)
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: write_buf: pd_write(0 812384256) failed (10054:An existing connection was forcibly closed by the remote host. ) for img
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: write_multi_buf: write_buf() failed (2060019:error occurred on network socket) with imh_status (0x1) for actual size
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: write_special_buf: write_buf failed: (2060019:error occurred on network socket). bytes: 0 status: 0x1
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: write_unknown_len_buf: write_special_buf() failed, byteswritten:<0>, status:<0x1> (2060019:error occurred on network socket)
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: impl_write_image: write_unknown_len_buf() failed (2060019:error occurred on network socket)
07:55:24.288 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: pi_write_image_v7: impl_write_image() failed (2060019:error occurred on network socket)
07:55:24.476 [4300.1456] <32> do_write_image: sts_write_image failed: byteswritten = 0, len = 262144
07:55:24.476 [4300.1456] <32> do_write_image: sts_write_image failed: error 2060019 (812384256 262144 0)
07:55:24.491 [4300.1456] <2> set_job_details: Tfile (51843): LOG 1445352924 32 bptm 4300 image write failed: error 2060019: error occurred on network socket
07:55:24.491 [4300.1456] <2> send_job_file: job ID 51843, ftype = 3 msg len = 96, msg = LOG 1445352924 32 bptm 4300 image write failed: error 2060019: error occurred on network socket
07:55:24.648 [4300.1456] <32> write_data: image write failed: error 2060019:
07:55:29.215 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: close_image: PdvfsClose(/znbumedia1#1/2/zmunmbx01.pcppi.com/zmunmbx01_vmware_backup/zmunmbx01.pcppi.com_1445350631_C1_F2.img) failed (5:Input/output error) for image fd
07:55:31.106 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: impl_close_image: close_image() failed (2060017:system call failed)
07:55:31.653 [4300.1456] <2> 51843:bptm:4300:znbumedia1: [DEBUG] PDSTS: impl_close_image: exit (2060017:system call failed) 0x4
07:55:31.653 [4300.1456] <16> 51843:bptm:4300:znbumedia1: [ERROR] PDSTS: pi_close_image_v7: impl_close_image() failed (2060017:system call failed)
07:55:31.653 [4300.1456] <2> 51843:bptm:4300:znbumedia1: [DEBUG] PDSTS: pi_close_image_v7: exit (2060017:system call failed)
07:55:31.793 [4300.1456] <2> set_job_details: Tfile (51843): LOG 1445352931 32 bptm 4300 sts_close_handle failed: 2060017 system call failed
10-21-2015 09:38 PM
Hi Riaan,
Currently, we have 2 media servers. The other one backs up perfectly fine, but this one seems unstable. On the network side, could you suggest what actions we can take? We tried a continuous ping and the connection is stable.
Thanks!
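A note on the ping test: ping only proves ICMP reachability, while the 10054 errors above are TCP connections being reset. NetBackup ships its own connectivity tests that exercise name resolution and the actual bpcd ports. A sketch with assumed default Windows paths; check the flags against your version's command reference:

```shell
REM Sketch: NetBackup connectivity tests that go beyond ICMP ping.
REM Paths are assumed Windows defaults; run from the master (znbumaster1).

REM Forward/reverse name resolution and connect times for configured hosts:
"C:\Program Files\Veritas\NetBackup\bin\bptestnetconn.exe" -a -s -v

REM Full bpcd connection test to the problem media server:
"C:\Program Files\Veritas\NetBackup\bin\admincmd\bptestbpcd.exe" -client znbumedia1 -verbose
```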
10-21-2015 11:07 PM
Hi,
Sorry, I should have said a networking-related error, not necessarily the network itself. Software could be causing the ports to be closed, etc.
As you've indicated that you had a hardware failure on this server, you should maybe think about reinstalling it after you duplicate the images you have on it.
Have you tried restores or duplication of any images from this pool since the failure?
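One low-risk way to answer that question is to read existing images back from the pool with bpverify, which touches every fragment without restoring anything. A sketch with assumed default Windows paths; substitute a backup ID from a job that actually completed, since the failed jobs in this thread never produced an image:

```shell
REM Sketch: spot-check image readability on the rebuilt MSDP pool.
REM Paths are assumed Windows defaults; adjust for your installation.

REM List images written for this client in the last 7 days (168 hours):
"C:\Program Files\Veritas\NetBackup\bin\admincmd\bpimagelist.exe" -client zmunmbx01.pcppi.com -hoursago 168

REM Verify one completed image end to end (reads every fragment back).
REM Replace the backup ID with one returned by bpimagelist above:
"C:\Program Files\Veritas\NetBackup\bin\admincmd\bpverify.exe" -backupid zmunmbx01.pcppi.com_1445350631 -v
```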
10-23-2015 02:43 AM
Hi Riaan,
We only did an MSDP reconfiguration for this. We have done replication of images and restoration of some VMs to the DR site, though...