A VMware and Symantec Technical White Paper
Challenges Using Traditional High Availability Solutions
Every business depends on a number of applications that are critical to its success. As a result, these applications and the systems they run on require a higher level of availability. One of the most common methods used to increase the availability of a business-critical application in a physical environment is to deploy a traditional high availability clustering solution such as Veritas Cluster Server (VCS) or Microsoft Cluster Server (MSCS). These solutions protect against an unplanned failure of a component by providing the ability to restart an application or set of applications on additional servers in the cluster. While typically associated with protection against unplanned server failures, these solutions can also be used to reduce the effects of a planned outage by shifting applications to redundant servers so that the original server can be maintained.
The trade-off in increasing application availability through traditional high availability clustering is additional cost in the form of redundant hardware, clustering software and support, and added complexity. Management costs also increase because multiple systems must be kept identical in configuration and patch levels. Operationally, it is extremely difficult to deploy a limited number of spare servers to provide redundancy for a larger set of applications because of difficulties with application compatibility, server patch levels, and so on. This typically results in small two-node clusters deployed for only the most critical applications, leaving the majority of applications not clustered at all.
High Availability in VMware Environments
As customers move forward with VMware® virtualization solutions, they recognize an immediate set of benefits far beyond a simple reduction in servers. VMware includes proven and widely deployed business continuity solutions in VMware vSphere™ 4.1 (“vSphere”) in the form of VMware vMotion™, VMware High Availability (VMware HA) and VMware Fault Tolerance (VMware FT).
Utilizing VMware’s revolutionary vMotion technology, IT administrators are able to move applications for server maintenance with zero downtime and no data loss. Coupled with the operating system isolation natively provided by VMware virtualization, this makes it very simple to provide a small set of highly consolidated servers capable of very high uptime at reduced administrative cost.
VMware HA provides a simple, reliable way to increase the availability of virtual machines hosting critical applications. VMware HA is a virtualization-based distributed infrastructure service of VMware vSphere 4.1, which monitors the health of virtual machines and the VMware ESX® hosts upon which they reside. If a fault is detected, the virtual machine is automatically restarted on another ESX host with adequate capacity to host it. VMware HA is included in all vSphere editions and can be enabled on a VMware cluster with a single check box. As VMware HA utilizes the storage and network connectivity already in place to support vMotion, enabling high availability is as simple as ensuring you have adequate server capacity to handle failure of one or more ESX hosts.
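The capacity requirement described above can be illustrated with a minimal sketch. It is not VMware HA's admission control algorithm; it is a simplified aggregate check, and the `Host` type, its memory fields, and the `tolerates_host_failures` function are hypothetical names introduced only for this example. It assumes memory is the only constrained resource and ignores whether individual virtual machines actually fit on a given surviving host.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Sequence


@dataclass(frozen=True)
class Host:
    """Hypothetical description of one ESX host in a cluster."""
    name: str
    capacity_gb: int  # memory available to virtual machines
    used_gb: int      # memory consumed by running virtual machines


def tolerates_host_failures(hosts: Sequence[Host], failures: int = 1) -> bool:
    """Return True if, for every possible set of `failures` failed hosts,
    the displaced load fits into the spare capacity of the survivors."""
    if failures >= len(hosts):
        return False  # no surviving host would remain to restart anything
    indexes = range(len(hosts))
    for failed in combinations(indexes, failures):
        # Load that must be restarted elsewhere.
        displaced = sum(hosts[i].used_gb for i in failed)
        # Unused memory on the hosts that are still running.
        spare = sum(
            hosts[i].capacity_gb - hosts[i].used_gb
            for i in indexes
            if i not in failed
        )
        if displaced > spare:
            return False
    return True


cluster = [
    Host("esx01", capacity_gb=64, used_gb=40),
    Host("esx02", capacity_gb=64, used_gb=40),
    Host("esx03", capacity_gb=64, used_gb=40),
]
print(tolerates_host_failures(cluster, failures=1))
print(tolerates_host_failures(cluster, failures=2))
```

In this example cluster, losing one host displaces 40 GB against 48 GB of spare capacity on the two survivors, so a single failure is tolerated. Losing two hosts displaces 80 GB against 24 GB of spare capacity, so it is not.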
VMware FT extends the capabilities of VMware HA to provide even higher levels of availability for mission-critical applications by allowing instantaneous transfer of services to a secondary image of VMware FT–enabled virtual machines. This allows virtual machines to continue operating through a server failure with zero downtime and no user interruption.
Challenges in Virtualizing Tier 1 and Tier 2 Applications
VMware HA and VMware FT provide increased availability for a large percentage of customers. In fact, more than 80 percent of VMware customers leverage one or both of these technologies to protect most or all of their virtual machines. However, a method to increase availability at the application layer is often desired, especially for business-critical Tier 1 and Tier 2 applications. Without protection at the application level, organizations are exposed to application failures that occur inside the virtual machine. In many cases, organizations have attempted to deploy a traditional application-clustering solution into the virtual machine’s guest operating system for this purpose.
While deploying a traditional application-clustering solution into the virtual machine addresses failures at the application layer, it also creates significant issues with the day-to-day operations of a virtualized environment as these solutions were designed for physical environments. These issues include the added complexity of maintaining multiple identical virtual machines to properly host failover, additional capacity needed to host spare servers, and difficulty in mapping application location to a specific virtual machine within the VMware management solution. More important, the addition of in-guest clustering significantly impacts the ability to make use of advanced VMware features such as vMotion, VMware HA, VMware Distributed Resource Scheduler (VMware DRS), and VMware Distributed Power Management (VMware DPM).
Extending Application High Availability in vSphere 4.1
With the release of vSphere 4.1, VMware is introducing an application programming interface (API) to allow third-party software vendors to deploy application monitoring components inside a VMware guest OS and inform VMware HA when problems arise. This API will allow application clustering vendors to develop application monitoring and control solutions that fully complement the virtual machine high availability and management provided by vSphere.
The joint solution will include two layers of protection. The first is the in-guest protection provided by the application HA vendor. This application-layer protection can include application-specific capabilities such as component-level monitoring, restarting of failed services, performance monitoring, and so forth. The second layer is VMware HA, which can restart the virtual machine in cases where the in-guest solution cannot resolve the issue.
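The two layers described above can be sketched as a simple escalation policy. This is an illustrative sketch only, not the vendor's implementation or the vSphere API: the function `protect_application`, its callback parameters, and the `max_service_restarts` limit are all hypothetical names chosen for this example. The in-guest layer is modeled as a bounded number of service restarts, and the second layer as a single request for a virtual machine restart.

```python
from typing import Callable


def protect_application(
    is_healthy: Callable[[], bool],
    restart_service: Callable[[], None],
    request_vm_restart: Callable[[], None],
    max_service_restarts: int = 3,
) -> str:
    """Run one protection cycle and report which layer handled it.

    All three callbacks are hypothetical stand-ins: a health probe, an
    in-guest service restart, and a request that the HA layer restart
    the virtual machine.
    """
    if is_healthy():
        return "healthy"

    # Layer 1: in-guest protection tries to repair the application itself.
    for _ in range(max_service_restarts):
        restart_service()
        if is_healthy():
            return "recovered-in-guest"

    # Layer 2: the in-guest layer could not resolve the issue, so the
    # problem is escalated to a restart of the whole virtual machine.
    request_vm_restart()
    return "escalated-to-vm-restart"


# Usage with stub callbacks: the application never recovers in-guest.
actions = []
outcome = protect_application(
    is_healthy=lambda: False,
    restart_service=lambda: actions.append("restart service"),
    request_vm_restart=lambda: actions.append("restart virtual machine"),
    max_service_restarts=2,
)
print(outcome, actions)
```

Bounding the in-guest restarts keeps the first layer from retrying indefinitely, so a fault it cannot fix always reaches the second layer.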
The user can enable application monitoring in the VM Monitoring Status section of the VMware HA settings. Enabling application monitoring allows the application-monitoring solution to register with the VMware application awareness API and communicate application status to VMware HA. Inside vCenter Server, the user will be able to determine which virtual machines are monitored at the application level and which are monitored only for basic virtual machine health. Figure 1 shows how the user can control the monitoring level for the cluster and for each virtual machine.