09-23-2022 07:20 AM
In this day and age, data tends to only increase in size. For our customers with ever-growing Oracle databases (DB), timely backups and restores are a challenge. We have many existing features within NetBackup to protect Oracle, and now we have added a solution for Oracle Very Large DBs (VLDB).
Figure 1. Oracle policy option to select multiple MSDP storage units
The designation of “Very Large” is arbitrary, but a widely accepted definition is a database that is too large for data protection to complete within the desired time window. Oracle DB protection efforts tend to focus on completing backups faster, while the ability to restore within the expected time frame is often ignored, resulting in missed Recovery Time Objectives (RTOs).
This new Oracle policy option segments the Oracle backup across multiple NetBackup MSDP (Media Server Deduplication Pool) storage units and lets you choose which storage units are used (see Figure 1). For a single stream, the result is a single backup image in the catalog, with the image’s fragments tracked in the catalog on each selected storage unit. A single backupid keeps tracking and reporting streamlined. You can also increase the number of parallel streams. Allowing simultaneous writes to multiple disk pools increases the efficiency of the streaming backup.
The number of storage units that gives the best results will vary from one database to the next. You can also use multiple parallel streams to tune your Oracle backup further. Storage units are linked to disk pools, and the most effective use of this option relies on multiple storage units that are linked to unique disk pools hosted on different media servers. This solution also works where the nodes of a single Flex Scale are managed independently and only one storage unit is presented to NetBackup.
The backup has an affinity for writing the same file, or piece, of the database backup to the same MSDP storage unit, so avoid changing this configuration often.
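To see why a stable configuration matters for this affinity, here is a minimal Python sketch. It is not NetBackup's placement logic, which this post does not describe. It assumes a simple hash-based mapping from piece name to storage unit, and the piece names and storage unit names are hypothetical.

```python
import hashlib


def assign_storage_unit(piece_name: str, storage_units: list[str]) -> str:
    """Map a backup piece to one storage unit, deterministically (illustrative only)."""
    # hashlib yields the same digest on every run, unlike the built-in hash() for str.
    digest = hashlib.sha256(piece_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(storage_units)
    return storage_units[index]


def placement(pieces: list[str], storage_units: list[str]) -> dict[str, str]:
    """Return the storage unit chosen for every piece."""
    return {piece: assign_storage_unit(piece, storage_units) for piece in pieces}


# Hypothetical piece and storage unit names, used only for this illustration.
pieces = [f"datafile_{n:02d}.dbf" for n in range(1, 9)]
two_units = ["msdp_stu_a", "msdp_stu_b"]
three_units = two_units + ["msdp_stu_c"]

first_run = placement(pieces, two_units)
second_run = placement(pieces, two_units)      # same configuration, same placement
after_change = placement(pieces, three_units)  # configuration changed

moved = [p for p in pieces if first_run[p] != after_change[p]]
print(f"stable across runs: {first_run == second_run}")
print(f"pieces that moved after adding a storage unit: {len(moved)} of {len(pieces)}")
```

With an unchanged list, every piece lands on the same storage unit on each run. Once the list changes, some pieces can land on a pool that has not stored them before, which is one plausible reason to keep the selection stable.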
Some of the considerations will be:
As more parallel streams and storage units are configured for the policy, the performance gain appears as geometric improvements in backup times. Align this with your backup and restore goals and your existing infrastructure to avoid creating performance bottlenecks. To meet the desired goals, you may need to add more MSDP pools rather than rely on a few large pools, each with a single node. Additionally, consider adding load-balancing media servers to share data movement responsibilities.
This solution can also use Storage Lifecycle Policies (SLPs) as the multiple storage targets, enabling you to maintain your current business continuity and Disaster Recovery (DR) plans for your Oracle data. When selecting multiple storage units, select a different SLP for each destination. If the desired SLP is not shown, confirm that it uses supported MSDP storage units. The SLPs for this use case must all be designed and configured with the same goals, including retention levels, but with different source and target storage units. For example, if the Oracle backup is split across two SLPs, each SLP would use a different backup storage unit and a different secondary operation storage unit. With replication through Auto Image Replication (AIR), the replication relationships between each on-premises MSDP and its target MSDP must all be under the same target primary server (see Figure 2). It is possible to replicate many-to-one, but this would remove the benefit of segmenting the image across multiple disk pools. If the replication targets for only a portion of the database went to a different NetBackup domain, or were not retained for the same period, the image would not be complete and a restore would not be possible.
Figure 2. SLP configuration requirement for multiple MSDP storage units with replication
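The SLP requirements above can be written down as a small pre-flight check. The Python sketch below is illustrative only: it does not call any NetBackup API, and the `SlpSummary` fields, SLP names, and server names are hypothetical values that you would fill in by hand from your own configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlpSummary:
    """Hand-entered summary of one SLP (hypothetical structure, not a NetBackup object)."""
    name: str
    retention_level: int
    backup_storage_unit: str
    secondary_storage_unit: str
    target_primary_server: str


def check_slp_set(slps: list[SlpSummary]) -> list[str]:
    """Return a list of problems found in a set of SLPs used for one segmented backup."""
    problems = []
    if len({s.retention_level for s in slps}) > 1:
        problems.append("retention levels differ between SLPs")
    if len({s.backup_storage_unit for s in slps}) < len(slps):
        problems.append("a backup storage unit is used by more than one SLP")
    if len({s.secondary_storage_unit for s in slps}) < len(slps):
        # Many-to-one replication works, but gives up the segmentation benefit.
        problems.append("a secondary storage unit is shared (many-to-one)")
    if len({s.target_primary_server for s in slps}) > 1:
        problems.append("replication targets are under different primary servers")
    return problems


consistent = [
    SlpSummary("slp_oracle_1", 5, "msdp_stu_a", "msdp_target_a", "primary-dr"),
    SlpSummary("slp_oracle_2", 5, "msdp_stu_b", "msdp_target_b", "primary-dr"),
]
mismatched = [
    SlpSummary("slp_oracle_1", 5, "msdp_stu_a", "msdp_target_a", "primary-dr"),
    SlpSummary("slp_oracle_2", 3, "msdp_stu_b", "msdp_target_b", "primary-other"),
]

print(check_slp_set(consistent))
print(check_slp_set(mismatched))
```

The first set passes with no findings. The second is flagged for both the retention mismatch and the split across primary servers, the two conditions that would leave an incomplete image at the target.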
When the need for a restore arises, NetBackup takes advantage of multiple sources to read and stream back to the destination, with each disk pool reading a piece of the database image simultaneously. This results in a faster restore time for a large amount of data. Adjust the disk pool’s Maximum I/O streams setting according to the peak performance of the disks and the potential number of restore and backup streams. This setting can be changed dynamically, without a restart of services, to meet changing needs.
Also consider the impact of load-balancing media servers in such a configuration. If all media servers have credentials to all storage servers, any of them can potentially perform the requested read during a replication or restore operation. When some media servers are already busy with other jobs, such as backups, NetBackup chooses the next least busy media server to perform the requested action. In most situations, it is best to give only the media servers needed for these recurring operations access to the disk pools in use.
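The effect of credentials on media server selection can be illustrated with a short Python sketch. It assumes that "least busy" means the fewest active jobs, which is a simplification because the actual load metric NetBackup uses is not stated here. The server names and job counts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MediaServer:
    """Simplified view of a media server (hypothetical structure)."""
    name: str
    active_jobs: int
    storage_servers: set[str] = field(default_factory=set)  # credentials held


def pick_media_server(servers: list[MediaServer], storage_server: str) -> Optional[MediaServer]:
    """Pick the least busy server among those with credentials to the storage server."""
    candidates = [s for s in servers if storage_server in s.storage_servers]
    if not candidates:
        return None
    # Assumption: load is measured by the count of active jobs.
    return min(candidates, key=lambda s: s.active_jobs)


servers = [
    MediaServer("media1", 12, {"msdp-storage-1", "msdp-storage-2"}),
    MediaServer("media2", 3, {"msdp-storage-1", "msdp-storage-2"}),
    MediaServer("media3", 1, {"msdp-storage-1", "msdp-storage-2", "msdp-storage-3"}),
]

# With credentials everywhere, the idle server is pulled into the restore read.
chosen_open = pick_media_server(servers, "msdp-storage-1")

# Restrict media3 to its own storage server; it is no longer a candidate.
servers[2].storage_servers = {"msdp-storage-3"}
chosen_restricted = pick_media_server(servers, "msdp-storage-1")

print(chosen_open.name, chosen_restricted.name)
```

Narrowing credentials shrinks the candidate list, so reads for replication and restore stay on the media servers you intended for them.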
Plan disk pool deployment to maximize the throughput from the Oracle clients to multiple media servers and their NetBackup deduplication volumes. Take advantage of this new parallelism to improve throughput performance for both your backup and recovery operations for your Very Large Oracle databases to meet your strict RTOs.