VM Snapshot Error

Kev_Lamb
Level 6

Environment:

NBU 7.7.1 Master
NBU 7.7.1 Media
RHEL 5.6 O/S
VMware 5.5

Client with error: RHEL7 running open-vm-tools

Hi,

I am having an issue with the snapshot backups of a RHEL7 VM client that is running open-vm-tools (the VMware ones are not available). The snapshot fails to quiesce the drives; if I run a non-quiesced snapshot, it works fine. The client does have a CIFS mount mapped to it from the OS.

Looking in the NBU logs I don't see much, or at least nothing that I can make sense of. In /var/log on the client I see the following errors:

 

Jan 21 11:08:11 lonbfbvmwwing4 kernel: CIFS VFS: cifs_mount failed w/return code= -13

Jan 21 11:08:11 lonbfbvmwwing4 vmsvc[710]: [ warning] [vmbackup] Error freezing filesystems.

Jan 21 11:08:24 lonbfbvmwwing4 vmsvc[710]: [ warning] [vmbackup] Error freezing filesystems.

I have looked at the CIFS mount on the client and it looks OK; however, it is an HDI mountpoint.

I was wondering if this could be a VSS issue, but I am unable to run an advanced install of open-vm-tools to check whether this option is selected. If anyone has any experience with open-vm-tools, it would be appreciated.

I am able to take a quiesced snapshot directly in VMware, but not via NetBackup.

Any help with the above would be appreciated.

Kev

Attitude is a small thing that makes a BIG difference
1 ACCEPTED SOLUTION

Marianne
Level 6
Partner    VIP    Accredited Certified
I believe that no VM-level backup should be taken of DB servers, unless it is one of the supported MS databases. What is the backup strategy for the MySQL DBs? Speak to the DBAs. NBU supports the agent sold by Zmanda. So rather install the NBU client and treat it as a physical client.

14 REPLIES 14

Douglas_A
Level 6
Partner Accredited Certified

Hi Kevin,

When you perform a snapshot with NBU, what error does the VC show?

Kev_Lamb
Level 6

Hi Doug,

In the vSphere client window I get the following error message

An error occurred while taking a snapshot: msg.snapshot.error-QUIESCINGERROR.

 

Kev

areznik
Level 5

I think you need the SYMCquiesce utility installed. It is supposed to work with VMware Tools to quiesce the machine.

Read about it here:  https://www.veritas.com/support/en_US/article.HOWTO70978

The requirements section says it needs VMware Tools; I'm not sure whether it will work with the open tools you mentioned.

Will_Restore
Level 6

The original log contains what looks like a permissions issue:

kernel: CIFS VFS: cifs_mount failed w/return code= -13

Make sure the mount is really valid and accessible.
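
For anyone wondering why -13 reads as a permissions problem: kernel return codes are normally negated errno values, and errno 13 is EACCES ("Permission denied"). A minimal sketch for decoding such a code is shown here; the helper name `describe_kernel_rc` is made up for illustration and is not part of any NetBackup or VMware tooling.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Map a (usually negative) kernel return code to its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The value from the cifs_mount line in the client's log.
print(describe_kernel_rc(-13))
```

This only tells you what the code means. It does not tell you which credential or share permission caused the failure.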

 

Dangerous_Dan
Level 5
Partner Certified

Hi Kevin,

Silly question but have you tried running a quiesced snapshot with the CIFS share unmounted?

 

This certainly does sound more like a VMware issue than something that NetBackup is failing to do.
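
Before testing with the share unmounted, it can help to list which network filesystems the guest currently has mounted. The sketch below parses text in the /proc/mounts format (device, mount point, filesystem type, options). The sample device and mount point names (`//hdi01/share`, `/mnt/hdi`) are hypothetical, and the set of filesystem types treated as "network" is an assumption you may want to adjust.

```python
def network_mounts(mounts_text, fstypes=("cifs", "smb3", "nfs", "nfs4")):
    """Return (mount_point, fstype) pairs for network filesystems in /proc/mounts text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in fstypes:
            found.append((fields[1], fields[2]))
    return found


# Hypothetical sample; on a real guest, read the text from /proc/mounts instead.
sample = """\
/dev/mapper/rhel-root / xfs rw,relatime 0 0
//hdi01/share /mnt/hdi cifs rw,relatime 0 0
/dev/sdb1 /mysql-data ext4 rw,relatime 0 0
"""

print(network_mounts(sample))
```

On the client itself this could be called as `network_mounts(open("/proc/mounts").read())`.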

Kev_Lamb
Level 6

Hi Areznik, I have just installed SYMCquiesce but get the same error.

Hi Will, I am able to access the mount point without any issues.

Just wondering if it is something to do with open-vm-tools...

Kev_Lamb
Level 6

The plot thickens, just found this in the SYMCquiesce log file

Build version: SYMCquiesce 1.0.0-003 Tue May 13 17:42:53 IST 2014
Stats - Thu Jan 21 18:26:02 2016

Freeze of volume [/mysql-data] returned status [-1]
Thaw of volume [/mysql-data] returned status [-1]

Looks like it may be the mysql area
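
If the log grows or covers several volumes, a small parser can pull out the failing freeze/thaw operations. This is a sketch based only on the two line shapes quoted above. It assumes a status of 0 means success, which the log excerpt does not confirm, and the function name is made up for illustration.

```python
import re

# Matches lines like: Freeze of volume [/mysql-data] returned status [-1]
LINE_RE = re.compile(r"^(Freeze|Thaw) of volume \[(.+?)\] returned status \[(-?\d+)\]$")


def failed_operations(log_text):
    """Return (operation, volume, status) for every freeze/thaw with non-zero status."""
    failures = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        # Assumption: status 0 is success, anything else is a failure.
        if match and int(match.group(3)) != 0:
            failures.append((match.group(1), match.group(2), int(match.group(3))))
    return failures


log = """\
Build version: SYMCquiesce 1.0.0-003 Tue May 13 17:42:53 IST 2014
Stats - Thu Jan 21 18:26:02 2016

Freeze of volume [/mysql-data] returned status [-1]
Thaw of volume [/mysql-data] returned status [-1]
"""

for operation, volume, status in failed_operations(log):
    print(f"{operation} failed on {volume} with status {status}")
```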

 

Kev

Marianne
Level 6
Partner    VIP    Accredited Certified
I believe that no VM-level backup should be taken of DB servers, unless it is one of the supported MS databases. What is the backup strategy for the MySQL DBs? Speak to the DBAs. NBU supports the agent sold by Zmanda. So rather install the NBU client and treat it as a physical client.

Kev_Lamb
Level 6

Thanks Marianne,

This is a new server for an application being brought into the company. I will speak to the server owners and find out how they will be managing the MySQL DB backups.

I will be moving this over to a standard client backup rather than a VM policy based backup.

Thanks to everyone who has helped with this issue.

Kev

sdo
Moderator
Partner    VIP    Certified

Apologies Kevin, I don't mean to teach you how to suck eggs... the above is perhaps more for our newer, and perhaps less experienced members, who may stumble across this topic... in how to think "enterprise" backup... after all, this product is named:   NetBackup Enterprise Server.

And the above behavioural characteristics at the VM layer are probably still true for any backup product... until you get into the realms of advanced storage array integration.

sdo
Moderator
Partner    VIP    Certified

my 2 cents...

.

Recap – the general styles of backup are:

A) Plain file system agent ONLY (skipping raw database files - but securing a copy of dump/export/datapump/RMAN-on-disk).

B) Plain file system agent PLUS database agent (skipping raw database files).

C) VM style, either “whole VM with no file cataloguing” or “whole VM with file cataloguing” ONLY.

D) VM style, either “whole VM with no file cataloguing” or “whole VM with file cataloguing” PLUS leveraged database agent.

(although there are others - e.g. flashbackup, and other complex snapshot based technologies)...

.

The rest of my text is pertinent to item C) above.

.

The issues with backup style C) above, for VM style backups of medium to large database servers… are:

  1. This captures what is highly likely to always be an incomplete snapshot of the live open raw database files within folders inside the virtual guest server (i.e. the effective backup client).  The backup image will contain raw database files which will be crash consistent only, and will very likely be unusable/corrupt when restored.
  2. In practice, some databases can be fairly large, and so this style of backup (i.e. VM style) can cause a significant amount of unnecessary IO at the SAN and/or LAN layers by reading blocks (from within VMDK/VHD files), transferring these blocks, and writing these blocks (to backup storage)… which will very likely be useless after restore.
  3. Causes unnecessary backup IO at the storage layer, i.e. between ESX/Hyper-V hypervisor hosts and the storage (SAN, iSCSI, NFS) - i.e. wasting time and resource reading, sending and writing the blocks of files which will be useless when restored.
  4. Causes unnecessary snapshot IO…
    • for VMware at the VMDK layer: the IO impact cost “during” backups is low, because all updated/new blocks within the database are simply ‘stored’ (written to a pending-write log) during VMware VADP style backups (i.e. an extra read is NOT taking place).
    • for Hyper-V at the VHD layer: the IO impact cost “during” backups is high, because all updated/new blocks within the database are ‘vectored’ (the prior blocks must be read and written to a VSS delta change log before they are overwritten) during Hyper-V VSS style backups (i.e. an extra read IS taking place).
  5. Causes unnecessary ESX/Hyper-V host CPU and IO at the VMDK/VHD layer:
    • for VMware the IO cost penalty is at the end of the backup… for consolidation when the VMware snapshot is removed, i.e. the full list of pending writes (which were queued up during the backup) must now be applied for real to the VMware VMDK files.
    • for Hyper-V the IO cost penalty is during backups… to capture the VSS delta change log, with minimal IO at the end of the backup when the VSS snapshot is simply discarded/deleted - i.e. there is no need to re-process the VSS "delta change log" because all it contains is the old blocks as they used to look before the writes (which occurred during the backup) took place.
  6. The busier the database the bigger the snapshot gets during backups.
  7. The bigger the VM is, the longer the backup takes, and so the longer the snapshot is active, and the longer it has to capture IO, and so the bigger the snapshot gets.  Nasty.
  8. Causes unnecessary IO at the LAN and/or SAN layer (between ESX/Hyper-V hosts and the backup host), i.e. sending lots of blocks which are essentially useless.
  9. Causes unnecessarily lengthy backup job durations, tying up backup scheduling resources which could otherwise be freed up for other backups.
  10. Causes wasted backup target storage:
    • If tape, then wasted tape, and if your tape drives are attached via FC SAN switches, then this is effectively wasted traffic across FC SAN switch ports.
    • If basic disk, or advanced disk – then wasted disk space - again also possibly across some kind of storage connectivity layer.
    • If de-dupe disk - then both wasted disk space AND wasted time, CPU, RAM, disk for “fingerprint hashing” and “fingerprint storage and recall”.
      • …and it is highly likely that the “changed” blocks inside “raw database file blocks” will nearly always appear to be unique to de-dupe… and so this problem is a significant problem, i.e. lots of de-dupe activity for something which is essentially useless.
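
Items 6 and 7 above can be put into rough numbers. The sketch below is a back-of-envelope estimate only: it assumes a constant backup throughput and a constant guest write rate, and it treats every write during the backup as landing in the snapshot (so repeated overwrites of the same block are over-counted, making this an upper bound). The function name and the example figures are hypothetical.

```python
def estimate_snapshot_growth_gb(vm_size_gb, backup_throughput_mb_s, write_rate_mb_s):
    """Rough upper bound on snapshot growth while a VM-style backup is running.

    Assumes constant rates, and that every MB written during the backup
    ends up in the snapshot delta.
    """
    # The bigger the VM, the longer the snapshot stays open.
    backup_seconds = vm_size_gb * 1024 / backup_throughput_mb_s
    # The busier the database, the more data accumulates in that window.
    return write_rate_mb_s * backup_seconds / 1024


# Hypothetical figures: 500 GB VM, 100 MB/s backup stream, 5 MB/s database writes.
print(estimate_snapshot_growth_gb(500, 100, 5))
# Doubling the VM size doubles the window, and therefore the snapshot growth.
print(estimate_snapshot_growth_gb(1000, 100, 5))
```

Under these assumptions the growth scales with both the VM size and the write rate, which is why large, busy database VMs are the worst case.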

.

Having said all that: if your VMs contain small non-volatile databases (which are doing their own dumps/exports/data-pumps/RMAN)... AND you are taking your VM style backup AFTER these database application dumps/exports/datapumps/RMAN have occurred, then there really is no problem doing VM style backups of said small virtual database servers - as long as you remember to delete/discard the restored raw database files before attempting recovery from the dump/export/datapump/RMAN backup, which should also have been captured within the VM style backup (as long as the backup ran after said dumps had completed).

In summary, the problem gets worse as the virtualised database servers get larger and/or more volatile - with possibly huge amounts of wasted IO, and resource, at multiple and various different times and stages, for files which will very likely be useless upon restore.  In which case, it really does become a very good idea indeed to use either backup type A) or backup type B) above... or a backup of type D) above (but even type "D" above will still accrue lots of "snapshot related" IO).  The only way to avoid VM level snapshot files growing very large during backups is to use backup type A) or B) above.

areznik
Level 5

Great writeup sdo, I'm bookmarking this for later. I think you should repost this as a blog for more visibility.

One thing that always frustrates me with these backups is that you often see servers built by people unaware of this problem, and they will put the active DB and the dumps on the C:\ drive with the OS. Now if I want to back up the OS with snapshots, I have no choice but to snap the DB as well, and I end up running into a lot of the problems you described. AFAIK excluding *.mdf and *.ldf doesn't really help, as these files still end up getting snapped. Do you know of any other clever solutions for this kind of scenario, other than asking the server owners to move their stuff to a different drive?

Marianne
Level 6
Partner    VIP    Accredited Certified
Totally agree! Blog, please sdo!

Kev_Lamb
Level 6

SDO, no apology needed. Great write-up, and certainly a reference document that should be blogged.

 

Kev
