The Backup Exec multi-server environment, also known as the CASO-MBES environment, is designed to meet varied customer requirements in both LAN and WAN configurations. Many large organizations deploy CASO-MBES environments to take advantage of job delegation and automatic load balancing of jobs across the storage devices on the managed media servers, which helps businesses avoid bottlenecks and meet strict backup windows.
CASO-MBES uses an IPC message queue mechanism for communication among the servers on both IPv4 and IPv6 networks. The communication between the servers happens over a secure channel by encrypting the data over the wire using TLS/SSL. Information related to jobs, job status, and alerts is sent over the message queue.
The system is designed to support even remote offices with limited network bandwidth; however, the faster the network, the better the Central Admin Server Option (CASO) environment will run. While Backup Exec works with both static and dynamic IP addresses, we have seen customers configure static IP addresses on all media servers for better DNS resolution and more reliable communication.
For more details on the recommendations for configuring and managing a CASO environment, refer to the following technote: https://www.veritas.com/support/en_US/article.100018502.
In order to help customers who are relatively new to CASO-MBES environments, we would like to share some best practices around configuring and deploying these environments.
If you are planning to use a CASO-MBES environment for the first time, here are a few simple things that can help you set up the environment correctly:
Backup Exec is capable of dealing with slow or flaky networks. However, when network errors persist, Backup Exec sets the communication state to “Stalled” or “No Communication”. Communication between the servers is crucial for job delegation and for using any devices that are shared with other managed servers. If the communication between the servers is disrupted, the devices attached to the MBES go into the “No Communication” state, which affects any jobs targeted to those devices.
Communication issues are often the result of a non-optimal network configuration or of external factors such as antivirus software and firewalls. Here are some basic best practices that can help avoid communication issues:
In some customer environments, temporarily disabling firewall and antivirus and running the netstat command has helped to isolate communication issues. If the issue is related to the firewall or antivirus, setting exclusion rules within the firewall or antivirus will help.
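When isolating a firewall or antivirus issue, it can also help to probe a specific port from the other server before and after changing the exclusion rules. The following is a minimal sketch of such a probe, not a Backup Exec tool. The function name `is_port_reachable` is our own, and the host and port you pass in are placeholders for the values used in your environment.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any socket-level failure (refused connection, timeout, failed name
    resolution) is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling `is_port_reachable("mbes01.example.com", 12345)` (a hypothetical server name and port) from the central administration server returns `False` if something between the two machines blocks the connection or nothing is listening on that port.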
To get the list of open ports and their status, use the netstat command. Refer to the screenshot below.
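If you save the output of `netstat -an` to a file, a short script can filter it by connection state. The sketch below assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State) and only looks at TCP rows. The names `Connection`, `parse_netstat`, and `ports_in_state` are illustrative and are not part of Backup Exec or netstat.

```python
import re
from typing import List, NamedTuple


class Connection(NamedTuple):
    proto: str
    local_port: int
    state: str


# Matches TCP rows such as:
#   TCP    0.0.0.0:135    0.0.0.0:0    LISTENING
# The local address may be IPv4 or bracketed IPv6, e.g. [::]:135.
_TCP_ROW = re.compile(r"^\s*(TCP)\s+(\S+):(\d+)\s+\S+\s+([A-Z_]+)\s*$")


def parse_netstat(output: str) -> List[Connection]:
    """Extract protocol, local port, and state from the TCP rows of `netstat -an`."""
    rows = []
    for line in output.splitlines():
        match = _TCP_ROW.match(line)
        if match:
            proto, _address, port, state = match.groups()
            rows.append(Connection(proto, int(port), state))
    return rows


def ports_in_state(output: str, state: str) -> List[int]:
    """Return the sorted, de-duplicated local ports that are in the given state."""
    return sorted({c.local_port for c in parse_netstat(output) if c.state == state})
```

Calling `ports_in_state(text, "LISTENING")` on the saved output lists the ports the server is listening on, which you can compare against the ports your firewall and antivirus exclusions are expected to allow.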
Ensure that the network meets the following latency considerations:
Recently we have seen customers facing intermittent connectivity issues. DNS resolution was working correctly, and exclusion rules for ports were properly configured in the firewall and antivirus. We checked for packet loss and found that a significant number of packets were lost during communication. There are different tools available to detect packet loss, but we used a simple ping test to detect the issue.
The following screenshot displays the ping command with the echo count option [-n count] to detect packet loss. In the screenshot, for example, 25% of packets were lost with an average latency of 51ms.
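To track packet loss over time rather than read it off the screen, the summary that ping prints can be parsed from captured output. The sketch below assumes the English-language Windows ping format ("Sent = ..., Received = ..., Lost = ..." and "Average = ...ms"); other locales and operating systems print different text. The names `PingSummary` and `parse_ping_summary` are our own.

```python
import re
from typing import NamedTuple, Optional


class PingSummary(NamedTuple):
    sent: int
    received: int
    loss_percent: float
    avg_ms: Optional[int]  # None when no replies were received


_PACKETS = re.compile(r"Sent = (\d+), Received = (\d+), Lost = (\d+)")
_AVERAGE = re.compile(r"Average = (\d+)ms")


def parse_ping_summary(output: str) -> PingSummary:
    """Parse the statistics block printed at the end of a Windows ping run."""
    packets = _PACKETS.search(output)
    if packets is None:
        raise ValueError("no packet statistics found in ping output")
    sent, received, lost = (int(group) for group in packets.groups())
    # Recompute the percentage from the counts instead of trusting the rounded text.
    loss_percent = 100.0 * lost / sent if sent else 0.0
    average = _AVERAGE.search(output)
    avg_ms = int(average.group(1)) if average else None
    return PingSummary(sent, received, loss_percent, avg_ms)
```

For output that reports 4 packets sent, 3 received, and an average of 51ms, the function returns a loss of 25.0 percent and an average latency of 51, matching the figures in the example above.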
For intermittent connectivity issues, possible causes to check include faulty network cables, routers, and network cards. It is also recommended to turn off any power-saving settings on these devices.
Update the firmware and drivers to the latest version for each server's hardware (SCSI/RAID, HBA, network cards, and so on), especially where a SAN is involved in the CASO setup. Also, confirm that the latest Microsoft service pack and patches have been applied to each machine. To check for available Windows updates and patches, run Windows Update.
Another important form of communication within the CASO system is the exchange of catalog files. Backup Exec offers several catalog modes that govern the transfer of catalog files between the servers and cater to different types of customer environments. It is important to choose the correct catalog mode for your network conditions.
The distributed catalog mode is the default and is efficient and suitable for a CASO environment that uses a wide area network (WAN) or a slow network connection. Replicated and centralized catalog modes are suitable when the environment has a stable, high-bandwidth network between the central administration server and the managed Backup Exec servers. For more information, refer to https://www.veritas.com/content/support/en_US/doc/72686287-139301530-0/v71453368-139301530.