cancel
Showing results for 
Search instead for 
Did you mean: 

VM snapshot backup is failing with socket read failed, An existing connection was forcibly closed

raeedi239
Level 3

Hello All,

Greetings,

Hope evryone had celebrated christmas and new year well.

I am facing one backup failiure with below error. Please suggest what might be possible causes for this type of failure.

tried below steps.

Consolidation

moved VM to different Datastore

removed lock files if there are any from backup host.

surprisingly backup is completing if i run that VM backup under new Storage unit with accelrator.

Parent Job:

Dec 30, 2018 11:33:12 AM - Error bpbrm (pid=4692) socket read failed, An existing connection was forcibly closed by the remote host.  (10054)
Dec 30, 2018 11:33:29 AM - Error bpbrm (pid=4692) could not send server status message to client
Dec 30, 2018 11:33:30 AM - Info bpbkar32 (pid=0) done. status: 13: file read failed


Sanpshot Job:


do_thaw: Unable to consolidate virtual machine disks, status: 36 (SYM_VMC_TASK_REACHED_ERROR_STATE). Error Details: [Unable to access file  since it is locked].
Dec 30, 2018 10:51:56 PM - Critical bpbrm (pid=9996) from client example: vfm_thaw: method: VMware_v2, type: FIM, function: VMware_v2_thaw
Dec 30, 2018 10:51:56 PM - Critical bpbrm (pid=9996) from client example : snapshot delete returned status 1
 

 

1 ACCEPTED SOLUTION

Accepted Solutions

Sorry for the delay and thank you all for your reply.

issue is resolved after upgrading the vmware tools and vmotioning VM to another datastore

View solution in original post

8 REPLIES 8

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

@raeedi239

Please tell us more about the environment where backup is failing.
Different media server to the one where backup is working?
Is media server the backup host as well? 

Have you created any log directories?
bpbkar, bpfis and VxMS on backup host, 
bpbrm and bptm on media server. 

Please show us all text in Job Details.

Different media server to the one where backup is working? 

yes backup is getting completed, when i trigger manual backup with accelrator(

there is no complete backup image match with track journal, a regular full backup will be performed.


Is media server the backup host as well?  yes

Have you created any log directories? yes, bpfis and bpbrm
bpbkar, bpfis and VxMS on backup host, 
bpbrm and bptm on media server. 

 

even after performing consolidation , backup is getting failed and OS team states thet below query.

found that the ‘Release disk lease’ job is not getting triggered after snapshot removal and due to which disk consolidation job is failing.

 

 

Job Desatils:

 

Jan 1, 2019 10:05:32 PM - granted resource  DC2DXI01-SS1-LSU2-DP1-stu
Jan 1, 2019 10:07:10 PM - estimated 832772569 kbytes needed
Jan 1, 2019 10:07:10 PM - begin Parent Job
Jan 1, 2019 10:07:10 PM - begin Application Snapshot: Step By Condition
Operation Status: 0
Jan 1, 2019 10:07:10 PM - end Application Snapshot: Step By Condition; elapsed time 0:00:00
Jan 1, 2019 10:07:10 PM - begin Application Snapshot: Read File List
Operation Status: 0
Jan 1, 2019 10:07:10 PM - end Application Snapshot: Read File List; elapsed time 0:00:00
Jan 1, 2019 10:07:10 PM - begin Application Snapshot: Create Snapshot
Jan 1, 2019 10:07:11 PM - Info bpbrm (pid=11768) Example is the host to backup data from
Jan 1, 2019 10:07:11 PM - Info bpbrm (pid=11768) reading file list for client
Jan 1, 2019 10:07:11 PM - Info bpbrm (pid=11768) start bpfis on client
Jan 1, 2019 10:07:11 PM - Info bpbrm (pid=11768) Starting create snapshot processing
Jan 1, 2019 10:07:11 PM - started process bpbrm (pid=11768)
Jan 1, 2019 10:07:15 PM - Info bpfis (pid=11080) Backup started
Jan 1, 2019 10:07:16 PM - snapshot backup of client Example using method VMware_v2
Jan 1, 2019 10:07:17 PM - Info bpbrm (pid=11768) INF - vmwareLogger:  Creating snapshot for vCenter server dc1vm-vcenc-i01.infra.oa.acn, ESX host dc2bm-esxic-i12.infra.oa.acn, BIOS UUID 421f9acb-d78e-b694-7056-4a3c2fe47d60, Instance UUID 501f6d9c-400e-ff59-0793-034112b1cc86, Display Name Example, Hostname Example.emea.loreal.intra 
Jan 1, 2019 10:07:17 PM - Info bpbrm (pid=11768) INF - vmwareLogger:  Connection state of virtual machine: connected.
Jan 1, 2019 10:07:48 PM - end Application Snapshot: Create Snapshot; elapsed time 0:00:38
Jan 1, 2019 10:07:48 PM - Info bpfis (pid=11080) done. status: 0
Jan 1, 2019 10:07:48 PM - Info bpfis (pid=11080) done. status: 0: the requested operation was successfully completed
Jan 1, 2019 10:07:48 PM - end writing
Operation Status: 0
Jan 1, 2019 10:07:48 PM - end Parent Job; elapsed time 0:00:38
Jan 1, 2019 10:07:48 PM - Info nbjm (pid=4712) snapshotid=Example_1546376830
Jan 1, 2019 10:07:48 PM - begin Application Snapshot: Policy Execution Manager Preprocessed
Jan 1, 2019 10:38:08 PM - end Application Snapshot: Policy Execution Manager Preprocessed; elapsed time 0:30:20
Jan 1, 2019 10:38:08 PM - begin Application Snapshot: Stop On Error
Operation Status: 0
Jan 1, 2019 10:38:08 PM - end Application Snapshot: Stop On Error; elapsed time 0:00:00
Jan 1, 2019 10:38:08 PM - begin Application Snapshot: Cleanup Resources
Jan 1, 2019 10:38:08 PM - end Application Snapshot: Cleanup Resources; elapsed time 0:00:00
Jan 1, 2019 10:38:08 PM - begin Application Snapshot: Delete Snapshot
Jan 1, 2019 10:38:09 PM - Info bpbrm (pid=6352) Starting delete snapshot processing
Jan 1, 2019 10:38:09 PM - started process bpbrm (pid=6352)
Jan 1, 2019 10:38:10 PM - Info bpfis (pid=11608) Backup started
Operation Status: 13
Jan 1, 2019 10:38:55 PM - Critical bpbrm (pid=6352) from client Example: do_thaw: Unable to consolidate virtual machine disks, status: 36 (SYM_VMC_TASK_REACHED_ERROR_STATE). Error Details: [Unable to access file  since it is locked].
Jan 1, 2019 10:38:57 PM - Critical bpbrm (pid=6352) from client Example: vfm_thaw: method: VMware_v2, type: FIM, function: VMware_v2_thaw
Jan 1, 2019 10:38:57 PM - Critical bpbrm (pid=6352) from client Example: snapshot delete returned status 1
Jan 1, 2019 10:38:58 PM - Info bpfis (pid=11608) done. status: 1
Jan 1, 2019 10:38:58 PM - end Application Snapshot: Delete Snapshot; elapsed time 0:00:50
Jan 1, 2019 10:38:58 PM - Info bpfis (pid=11608) done. status: 1: the requested operation was partially successful
Jan 1, 2019 10:38:58 PM - end writing
Operation Status: 1
Operation Status: 13
file read failed  (13)

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

You need to check bpbkar and bpfis logs on the backup host where the backup is failing. 
Please ensure that logging level is set to 3.
(Logging level 5 is only needed if you intend logging a Support Call with Veritas.)

I have also recently faced this issue, if the backups which are failing is incremental backups, then try running a full backup from your end for the same VM, it might be caused due to any old track log issue.

Hello Avinash,

 

Thanks for your reply. Yes, As i said earlier backup is getting completed if i trigger full backup. but next scheduled differential backup is failing. So any idea how to prevent this issue.

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified


Yes, As i said earlier backup is getting completed if i trigger full backup. but next scheduled differential backup is failing. 

@raeedi239, Actually, you did not, so no wonder we are a bit confused. 

You said this in your opening post: 

surprisingly backup is completing if i run that VM backup under new Storage unit with accelrator.

Above statement makes it look like backup fails when running to 'old' STU (old media server), successful when running to 'new' STU ('new' media server). 

You also replied 'yes' to this question :

Different media server to the one where backup is working? 

yes .....

Are Full and Diff backups running on different STUs / media servers? 

Have you had a look yet at bpfis and bpbkar on backup host (media server)? 

We are seeing similar issues on one of our VMs. Suddenly the VM won't backup via MSDP accelerated Backup. B2D is functioning fine. Additionally it seems that the GRT processing is really slow.

Other VMs are running fine through backup.

We tried to expire the whole MSDP catalog running the backups over a month on B2D, but after the first backup on MSDP the issue reappeared immedeately doing the accelerated backup.

At the moment we leave the backup for this VM on B2D.

Sorry for the delay and thank you all for your reply.

issue is resolved after upgrading the vmware tools and vmotioning VM to another datastore