IHAC who has configured a central SPA (5 nodes, non-HA) on ESX 4.1 (3 ESX hosts) and has 80 remote office locations replicating to the central SPA.
Node breakdown as below:

All-in-one PureDisk server with a storage capacity of 500 GB to 1 TB:
- PureDisk VMware guest running in a VMDK file on top of a VMFS partition on internal or direct-attached storage
- Typical client agent count: up to 10 systems

ESX infrastructure dedicated to PureDisk VMware guests; multi-node PureDisk server with a storage capacity of up to 8 TB:
- 3 to 4 PureDisk VMware guests using Raw Device Mapping to access the storage

ESX infrastructure dedicated to PureDisk VMware guests; multi-node PureDisk server with a storage capacity of up to 24 TB:
- 3 to 10 PureDisk VMware guests using Raw Device Mapping to access the storage
The PureDisk all-in-one VMware guest requires the following ESX resources:
- 2 vCPU
- 4 GB RAM
- 500 GB to 1 TB on the VMFS storage partition

For the 8 TB configuration, the PureDisk VMware guests require the following ESX resources:
- 4 to 6 vCPU
- 8 to 12 GB RAM
- up to 8 TB of internal or direct-attached storage accessed using Raw Device Mapping

For the 24 TB configuration, the PureDisk VMware guests require the following ESX resources:
- 4 to 16 vCPU
- 8 to 32 GB RAM
- up to 24 TB of FC SAN-based storage accessed using Raw Device Mapping
I think that spreading the PD Storage Pool across VMs is not the best idea. I would suggest one physical machine with dedicated physical storage; 32 GB RAM works for 20 TB of CR storage.
On the other hand, PD does not use more than two cores, so using more than one machine is effectively a way of forcing multi-threading. Perhaps a dedicated /Storage volume located on a separate physical array for each machine would be the most efficient solution. 2 vCPUs per machine may be enough.
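The 32 GB per 20 TB rule of thumb above can be scaled mechanically to other pool sizes; a minimal sketch (the 24 TB target is a hypothetical pool size, not a figure from this thread):

```shell
# Scale the 32 GB RAM / 20 TB CR-storage ratio to a hypothetical pool size.
RAM_GB=32
CR_TB=20
TARGET_TB=24   # hypothetical target pool size, adjust to the real pool
echo "$((RAM_GB * TARGET_TB / CR_TB)) GB RAM"   # -> 38 GB RAM
```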
Any more numbers/stats? How much data is in the scope of this backup?
Personally I think this is a bad idea. It might work, but if you have a hardware failure on your virtual infrastructure and you need to start doing restores (possibly of entire VMs), you're stuck. Backup systems should be separate from your virtual systems.
As f25 said, one decent system with a tray of disks (a Dell R710 + MD1200 tray, or the like) would be a much more viable option.
Well, a while ago I assisted a customer with rolling out PD 6.6.1 on ESX running on a dedicated blade node, for use as PDDO with NBU.
We had a lot of issues with the clustered option and had to abort that design. Installing non-clustered works better, but the VMDKs must be created as "eagerzeroedthick". Normally that is only a requirement for disks used in a cluster, but the PD install wizard choked on plain thick disks. PDOS also installs differently from node to node, and disk numbering is random. It seems safest to present only one disk at PDOS installation time for the OS install, and then, once that part of the installation is finished, add the remaining disks to be used for /Storage.
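For reference, an eagerzeroedthick VMDK can be created up front on the ESX host with vmkfstools rather than through the wizard; a minimal sketch, where the datastore path, directory, and 1 TB size are placeholder assumptions:

```shell
# Pre-create the /Storage disk as eagerzeroedthick before presenting it
# to the PDOS guest (path and size are hypothetical, not from the thread).
vmkfstools -c 1024G -d eagerzeroedthick \
  /vmfs/volumes/datastore1/pd-node1/pd-node1_storage.vmdk
```

This is a host-side one-off, so it only helps if you create the data disks manually and attach them after the OS install, as described above.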
The other main issue is with networking: PDOS only has a driver for the E1000 NIC type, which is not recommended by VMware. So the installation has to be done with E1000, then VMware Tools installed, followed by manual re-configuration of the networking in PDOS. It's messy, as the PDOS installer scripts seem to get "confused" after the network re-configuration.
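A hedged sketch of that manual re-configuration step, assuming PDOS follows normal SLES conventions (the file path and commands are assumptions, not confirmed for PDOS):

```shell
# After installing VMware Tools and swapping the vNIC type in vSphere,
# check that the interface config survived the NIC change, then restart
# networking inside the guest (standard SLES locations assumed).
cat /etc/sysconfig/network/ifcfg-eth0   # verify IPADDR/NETMASK are intact
rcnetwork restart                       # SLES network service restart
```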
Performance: well, the three PD nodes were on a dedicated blade, so it worked quite OK. And as the storage size wasn't massive (4 TB), the disk I/O was acceptable after seeding the pool first. Re-hydration during restore and duplication is of course a bit slow, but then again re-hydration is slow to start with... If I remember correctly, we saw something like 8-12 MB/s during restore. But as always, there are so many other variables affecting restores that I wouldn't completely blame the virtualized PD for that.
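To put that rate in perspective, a full re-hydration of the whole pool at those speeds takes days; a rough back-of-envelope check, taking 10 MB/s as my midpoint assumption for the 8-12 MB/s observed above:

```shell
# Time to re-hydrate the full 4 TB pool at ~10 MB/s (assumed midpoint).
DATA_MB=$((4 * 1024 * 1024))   # 4 TB expressed in MB
RATE_MBS=10                    # assumed average restore throughput, MB/s
SECS=$((DATA_MB / RATE_MBS))
echo "$((SECS / 3600)) hours"  # -> 116 hours
```

In practice you would rarely restore the entire pool at once, but it shows why restore SLAs need to be checked against re-hydration speed, not backup speed.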
It is difficult to say whether the number of issues encountered in production was due to the setup running in a virtual environment, the normal instability of PureDisk, or a combination of both, but the net result was a quite unstable PDDO disk pool in NBU.
Based on the experience from this engagement, I would not design and deploy this sort of configuration for any production use again. It is fine for lab/demo/test though.