Forum Discussion

JRV2
Level 4
10 years ago

Guest vs. host dedupe

BE2012 SP4 Vray on WS2008 R2.

WS2012 Hyper-V host with WS2012 Hyper-V guests, agent installed in the guests.

Experimenting with deduplication. Using client-side dedupe. Set up a dedupe device and backed up a guest to it, from the guest, a dozen or so times. Got about 6:1 compression. Then changed the backup job so we were backing up the guest from the host to the dedupe device. Got 1.4:1. Added another WS2012 guest, backing up from the host. Compression remained 1.4:1.

Is it expected that dedupe compression will be very low when we back up from the host instead of from the guest?

4 Replies

  • Yes. There is a document, which I don't have to hand at the moment, stating that VMs dedup poorly.
  • PKH, thanks for your reply!

    Does it improve if I use a dedupe appliance?

    Does it improve with BE2014?

  • The dedup ratios will differ because one job backs up the VHD files (using a stream handler and then doing a GRT operation), while the other backs up the data within the file system of the VM (that is, inside the VHDs) using normal file system dedup operations.

    Because all sorts of random changes can happen inside an active VHD at any point in time, I would expect the VHD not to deduplicate as well. The exact difference in ratio would be difficult to predict, though, so I am not sure whether the values you quoted are expected.

    That said, if you do only agent-based backups of the data inside a VM, a full DR operation would be more complicated than restoring the VM itself. Backing up the VM itself may also give performance gains for both backups and restores.

    As such, even with the decrease in dedup ratio, you might still want to do VM backups because of the other advantages.
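As a rough illustration of the reasoning in the reply above, here is a toy Python sketch of why an image that contains constantly rewritten regions can dedupe far worse than the unchanged files inside it. This is not Backup Exec's algorithm: the fixed 4 KiB segment size, the `dedupe_ratio` and `simulate` helpers, and the sizes of the "file" and "churn" regions are all made-up assumptions for the sake of the example.

```python
import hashlib
import os

SEGMENT = 4096  # hypothetical fixed segment size, not a Backup Exec setting


def dedupe_ratio(streams, segment=SEGMENT):
    """Return logical bytes / stored bytes for fixed-size segment dedupe."""
    logical = 0
    stored = {}  # segment hash -> segment length, kept once per unique segment
    for stream in streams:
        logical += len(stream)
        for offset in range(0, len(stream), segment):
            piece = stream[offset:offset + segment]
            stored.setdefault(hashlib.sha256(piece).digest(), len(piece))
    return logical / sum(stored.values())


def simulate(backups=12, file_bytes=256 * SEGMENT, churn_bytes=256 * SEGMENT):
    """Compare guest-style and host-style backups of a nearly idle VM (toy model)."""
    files = os.urandom(file_bytes)  # guest file data that never changes
    # Guest-style: each backup streams only the unchanged file data.
    guest = [files for _ in range(backups)]
    # Host-style: each backup streams the whole disk image, including a
    # region (page file, logs, temp data) that is rewritten between backups.
    host = [files + os.urandom(churn_bytes) for _ in range(backups)]
    return dedupe_ratio(guest), dedupe_ratio(host)


if __name__ == "__main__":
    guest_ratio, host_ratio = simulate()
    print(f"guest-style: {guest_ratio:.1f}:1, host-style: {host_ratio:.1f}:1")
```

With these invented sizes, twelve guest-style backups come out at 12:1, while twelve host-style backups come out at about 1.8:1, because the rewritten region is stored again every time. Real ratios depend on the product's segmenting and stream handlers, so the numbers only show the direction of the effect, not what to expect in practice.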

  • As regards the dedupe vs. non-dedupe ratios, the 2 VMs I'm using for testing barely change from day to day. One is not even in production and runs only WS2012 and antivirus. The other WS2012 VM is a print server. If anything, I'd expect these 2 nearly-idle servers to yield much higher compression than most servers.

    I guess what I'm figuring out is that it would make the best use of available media to make a weekly, non-dedupe host backup to tape for DR purposes on weekends, which can be taken offsite, and daily guest backups to the dedupe device to be used for routine restores.

    Thanks for your help!