VMware backups with multiple disks fail with status 12

matt077
Level 5
Partner Accredited

Hi all,

I seem to have an odd problem: VMware clients which have more than one disk fail with status 12 (file open failed). We can see the boot disk gets backed up, but when the job starts to back up the data disk (2nd drive) the backup fails with status 12. From the logs we see the below -

05/02/2017 13:53:23 - Error bpbrm (pid=100388) from client clientname.com: ERR - Cannot open file <vix>[vmware-storage] clientname/clientname_1-000002.vmdk error = 25
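
For anyone collecting these failures across several jobs, the error line above can be picked apart with a small script. This is only a sketch: the regular expression is based on the single log line quoted in this post, and the field names (`client`, `datastore`, `vmdk`, `code`) are labels made up for the example, not NetBackup terminology.

```python
import re

# Pattern modelled on the one bpbrm line quoted above; other NetBackup
# versions may word the message differently (assumption).
VIX_OPEN_ERROR = re.compile(
    r"from client (?P<client>\S+): ERR - Cannot open file <vix>"
    r"\[(?P<datastore>[^\]]+)\] (?P<vmdk>\S+\.vmdk) error = (?P<code>\d+)"
)


def parse_vix_open_error(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = VIX_OPEN_ERROR.search(line)
    return match.groupdict() if match else None


line = (
    "05/02/2017 13:53:23 - Error bpbrm (pid=100388) from client clientname.com: "
    "ERR - Cannot open file <vix>[vmware-storage] "
    "clientname/clientname_1-000002.vmdk error = 25"
)
print(parse_vix_open_error(line))
```

Grouping the results by `vmdk` should make it easy to confirm whether it is always the second disk that fails.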

If we run a backup of just the data disk or just the boot disk then the backup runs fine.

Master, media and client are all running NBU version 7.7.3. The master and media servers are Windows 2012 R2 Standard and the client is Solaris 11.3.

Has anyone else had this problem at all? Support and Google have both come to a dead end currently.

As a temporary fix, we have created two VMware policies, one for data disks and one for boot disks... not ideal really.

Accepted Solutions

matt077
Level 5
Partner Accredited

It seems we found the problem. 7.7.3 uses a version of VDDK (6.0.0) which has a bug where VixDiskLib crashes with our version of VMware.

The bug was fixed in VDDK version 6.0.2. That code won't be implemented until NetBackup 8.0.
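
To make the version check concrete, here is a minimal sketch that compares a VDDK version string against the 6.0.2 release mentioned above. The function names are hypothetical, and how you find out which VDDK version your backup host actually ships is not covered here.

```python
def parse_version(text):
    """Turn a dotted version such as '6.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# First VDDK release reported in this thread to contain the VixDiskLib fix.
FIXED_IN = parse_version("6.0.2")


def predates_fix(vddk_version):
    """True if the given VDDK version is older than the fixed release."""
    return parse_version(vddk_version) < FIXED_IN


print(predates_fix("6.0.0"))  # the VDDK version reported for NBU 7.7.3
print(predates_fix("6.0.2"))
```

Comparing tuples of integers avoids the string-comparison trap where "6.0.10" would sort before "6.0.2".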

Update from Support -

Looking at the above error, this is a VixDiskLib crash problem, which VMware has identified and fixed.
It is a VMware issue that they recently fixed and we won't implement the VMware code until NetBackup 8.0.

The errors we saw were the same as in the tech doc below -

https://www.veritas.com/support/en_US/article.000108181

10 REPLIES

Thiago_Ribeiro
Moderator
Partner    VIP    Accredited

Hi,

Please give us more information about your environment: vCenter version, transport mode, etc.

Thanks.

matt077
Level 5
Partner Accredited

vCenter is running version 6.0.0.20100 and the ESXi hosts are running 6.0.0, 4192238.
The transport type used for this policy is nbd.

We have a number of VMs we back up which run fine. We have about 5, though, which keep failing with status 12.

Thiago_Ribeiro
Moderator
Partner    VIP    Accredited

Hi,

Have you confirmed that the Linux version on the VM is supported?

See Supported VMware/Hyper-V guest operating systems in Statement of Support for NetBackup 7.x in a Virtual

https://www.veritas.com/support/en_US/article.000006177

I'm not sure if this version is supported. As you said, this problem happens only with a small number of VMs, so this could be the cause.

Thiago

matt077
Level 5
Partner Accredited

The only thing we have changed is that we upgraded from NetBackup 7.6.1.2 to 7.7.3.

We have now worked out that the VMs which are failing have more than 1 disk. When the job starts the backup of the 2nd disk, it fails with status 12. We can back up the boot disk with one policy and the data disk with a different policy, which works fine until we hit a client which has got 3 disks.

Are there any timeouts for the VMware snapshots within NBU or VMware?

Thiago_Ribeiro
Moderator
Partner    VIP    Accredited

Hi,

I ran a simulation on NBU SORT to check if there is something with this version. It doesn't make sense that 1 disk works and 2 disks don't...

Look at this. Was this patch applied to your vCenter?

VM.JPG

 

Do your policies have Block Level Incremental backups enabled? If yes, check this link.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21368...

Attached is the report that was generated from NBU SORT.

Another thing: you said that VMs with more than 1 disk are failing. Did you check the communication between the media server and the ESX hosts where these VMs are hosted?

The tech note below covers best practices for each transport mode.

https://www.veritas.com/support/en_US/article.000094725

 

matt077
Level 5
Partner Accredited

Thanks for the update. I've shown them what you have posted and am awaiting an update from the VMware team on this.

We ran this for 2/3 days in 7.6.1.2, then I upgraded to 7.7.3 to stay in line with support. Everything seemed to work fine in 7.6 with regard to VMware backups.

Thiago_Ribeiro
Moderator
Partner    VIP    Accredited

Hi, I'm facing a problem with this VMware version. I have a VM larger than 2 TB for which thousands of events called Map Disk Region are generated, causing unnecessary consumption of disk space until finally the vCenter server becomes unavailable.

@matt077, thanks for sharing this information with us.

ReleaseNotes_VM.png

Release Notes 6.0.1 - https://www.vmware.com/support/developer/vddk/vddk-601-releasenotes.html

Release Notes 6.0.2 - https://www.vmware.com/support/developer/vddk/vddk-602-releasenotes.html

Release Notes 6.0.3 - https://www.vmware.com/support/developer/vddk/vddk-603-releasenotes.html

Technote Veritas – https://www.veritas.com/support/en_US/article.000082809

Technote VMware – https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10073...

Regards,

 

Thiago

flower108
Level 1
Partner Accredited

This problem is still there in NBU 8.0:

Oct 1, 2017 2:19:40 AM - Info nbjm (pid=17503) starting backup job (jobid=5703622) for client kronosmobpd02, policy COPY-VM_Linux_Lex_Poor, schedule Full
Oct 1, 2017 2:19:40 AM - estimated 0 kbytes needed
Oct 1, 2017 2:19:40 AM - Info nbjm (pid=17503) started backup (backupid=kronosmobpd02_1506838780) job for client kronosmobpd02, policy COPY-VM_Linux_Lex_Poor, schedule Full on storage unit MSDP_lexnbupd01a using backup host lexnbupd01a
Oct 1, 2017 2:19:40 AM - started process bpbrm (pid=155024)
Oct 1, 2017 2:19:41 AM - connecting
Oct 1, 2017 2:19:41 AM - connected; connect time: 0:00:00
Oct 1, 2017 2:19:56 AM - Info bpbrm (pid=155024) kronosmobpd02 is the host to backup data from
Oct 1, 2017 2:19:56 AM - Info bpbrm (pid=155024) reading file list for client
Oct 1, 2017 2:19:56 AM - Info bpbrm (pid=155024) starting bpbkar on client
Oct 1, 2017 2:19:56 AM - Info bpbkar (pid=155040) Backup started
Oct 1, 2017 2:19:56 AM - Info bpbrm (pid=155024) bptm pid: 155049
Oct 1, 2017 2:19:57 AM - Info bptm (pid=155049) start
Oct 1, 2017 2:19:57 AM - Info bptm (pid=155049) using 262144 data buffer size
Oct 1, 2017 2:19:57 AM - Info bptm (pid=155049) using 30 data buffers
Oct 1, 2017 2:19:58 AM - Info bpbkar (pid=155040) INF - Backing up vCenter server ecsvcpd03.corp.ad.fmcna.com, ESX host dmzbosesx02.corp.ad.fmcna.com, BIOS UUID 42067a9e-1ff4-d7d9-5d96-1effc83632c1, Instance UUID 50062b88-8705-404a-37c4-3cfc3bd7bca6, Display Name kronosmobpd02, Hostname kronosmobpd02
Oct 1, 2017 2:20:09 AM - begin writing
Oct 1, 2017 2:20:18 AM - Info bptm (pid=155049) start backup
Oct 1, 2017 2:22:41 AM - Info bpbkar (pid=155040) INF - Transport Type = nbd
Oct 1, 2017 4:43:49 AM - Error bpbrm (pid=155024) from client kronosmobpd02: ERR - Cannot open file <vix>[dmzbosvmax01_03] kronosmobpd02/kronosmobpd02_1-000002.vmdk error = 25
Oct 1, 2017 4:43:50 AM - Critical bpbrm (pid=155024) from client kronosmobpd02: FTL - cleanup() failed, status 12
Oct 1, 2017 4:43:52 AM - Error bptm (pid=155049) media manager terminated by parent process
Oct 1, 2017 4:46:17 AM - end writing; write time: 2:26:08
Oct 1, 2017 4:46:29 AM - Info lexnbupd01a (pid=155049) StorageServer=PureDisk:lexnbupd01a; Report=PDDO Stats (multi-threaded stream used) for (lexnbupd01a): scanned: 41473681 KB, CR sent: 697863 KB, CR sent over FC: 0 KB, dedup: 98.3%, cache disabled
Oct 1, 2017 4:46:32 AM - Critical bpbrm (pid=155024) unexpected termination of client kronosmobpd02
Oct 1, 2017 4:46:32 AM - Info bpbkar (pid=0) done. status: 12: file open failed
file open failed (12)
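
A long job details dump like the one above is easier to read once the Error and Critical entries are pulled out. The sketch below parses lines in the timestamp format shown in this post; the pattern is inferred from this one log only, the function names are hypothetical, and month names are assumed to be English (the default C locale).

```python
import re
from datetime import datetime

# Matches entries such as:
#   Oct 1, 2017 4:43:49 AM - Error bpbrm (pid=155024) <message>
# Lines without a severity and pid (e.g. "begin writing") are skipped.
ENTRY = re.compile(
    r"^(?P<stamp>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<severity>Info|Error|Critical) (?P<process>\S+) "
    r"\(pid=(?P<pid>\d+)\) (?P<message>.*)$"
)


def parse_entry(line):
    """Parse one job details line into a dict, or return None if it does not match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["stamp"] = datetime.strptime(entry["stamp"], "%b %d, %Y %I:%M:%S %p")
    entry["pid"] = int(entry["pid"])
    return entry


def failures(lines):
    """Return only the Error and Critical entries, in log order."""
    parsed = (parse_entry(line) for line in lines)
    return [e for e in parsed if e and e["severity"] in ("Error", "Critical")]


sample = [
    "Oct 1, 2017 2:20:09 AM - begin writing",
    "Oct 1, 2017 2:22:41 AM - Info bpbkar (pid=155040) INF - Transport Type = nbd",
    "Oct 1, 2017 4:43:49 AM - Error bpbrm (pid=155024) from client kronosmobpd02: "
    "ERR - Cannot open file <vix>[dmzbosvmax01_03] "
    "kronosmobpd02/kronosmobpd02_1-000002.vmdk error = 25",
    "Oct 1, 2017 4:43:50 AM - Critical bpbrm (pid=155024) from client kronosmobpd02: "
    "FTL - cleanup() failed, status 12",
]
for entry in failures(sample):
    print(entry["stamp"], entry["severity"], entry["process"], entry["message"])
```

In this log the first failure comes more than two hours after the transport type line, which is worth keeping in mind when looking at the snapshot timeout question raised earlier in the thread.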

Marianne
Level 6
Partner    VIP    Accredited Certified

Can you confirm that you are experiencing the same issue as @matt077  (https://www.veritas.com/support/en_US/article.000108181) ?

Have you checked VxMS logs?

According to above TN, the issue seems to be with VMware:

Due to known VMware issue in VDDK....

Cause

When many VMs are backed up concurrently, VDDK can crash occasionally.
When many virtual machines are backed up in parallel, VixDiskLib might crash. The issue is not consistently reproducible, but was determined to be a problem with null entries in libgvmomi. This issue has been fixed in this release.

See also https://www.vmware.com/support/developer/vddk/vddk-602-releasenotes.html