
Understand, Plan and Rehearse Ransomware Resilience series - Design to Recover

Rachelzhu

Five Key Steps in Defining a Cyber-recovery Strategy

Cyber incident response teams must be well equipped with the right tools and knowledge to manage a ransomware attack and to recover services in line with the expectations of the business, its customers, and its regulators. The steps below explain how to build a plan, who the stakeholders are, and which data protection strategies, policies, and processes apply.

 

Identify Critical Business Functions

Identifying critical business functions helps you restore your business quickly and efficiently in the event of a ransomware attack.

  • Identify the critical infrastructure and business services, and determine the criticality of the data that these services store or process.
  • Prioritize systems and data in order of importance, and determine which systems and data need to be restored first in the event of a disruption.
  • Use the prioritized list to plan the order and timeline of system recovery. The timeline should take into account factors such as dependencies between systems, recovery time objectives (RTOs), and recovery point objectives (RPOs).
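
As a minimal sketch of the prioritization steps above, the following Python example derives a recovery order that respects dependencies and restores higher-priority systems first whenever several are ready. The system names, priorities, and dependency map are hypothetical and only illustrate the idea; a real inventory would come from your own business impact analysis.

```python
import heapq
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical inventory: each system lists what it depends on and a
# business priority (1 = most critical). All names are illustrative.
SYSTEMS = {
    "active-directory": {"depends_on": [], "priority": 1},
    "database": {"depends_on": ["active-directory"], "priority": 1},
    "payments-api": {"depends_on": ["database"], "priority": 1},
    "reporting": {"depends_on": ["database"], "priority": 3},
    "intranet": {"depends_on": ["active-directory"], "priority": 2},
}


def recovery_order(systems):
    """Return system names in an order that honors dependencies and
    prefers the most critical system among those ready to restore."""
    sorter = TopologicalSorter(
        {name: info["depends_on"] for name, info in systems.items()}
    )
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    ready = []  # min-heap of (priority, name)
    order = []
    while sorter.is_active():
        # Collect systems whose dependencies have all been restored.
        for name in sorter.get_ready():
            heapq.heappush(ready, (systems[name]["priority"], name))
        # Restore the most critical ready system next.
        _, name = heapq.heappop(ready)
        order.append(name)
        sorter.done(name)
    return order


if __name__ == "__main__":
    for step, name in enumerate(recovery_order(SYSTEMS), start=1):
        print(step, name)
```

With this sample data, the low-priority reporting system is restored last even though its dependencies are satisfied early, because the more critical systems that are ready go first.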

Identify Business Critical Functions with MTD

Recovery time and recovery point objectives (RTO and RPO) are traditionally used as goals to frame the conversation about overall recovery.

  • Recovery point objective (RPO): The maximum data loss the organization can sustain, usually expressed as a window of time. It is based on backup schedules, data needs, and system availability.
  • Recovery time objective (RTO): The amount of time the organization can take to bring critical systems back online. This is where disaster recovery activities typically occur.

Maximum Tolerable Downtime (MTD) is simply how long the key revenue-flow applications can be offline. MTD brings focus to the critical functions an organization needs to get up and running to avoid a business disaster.

Business function downtime is based on two elements: the recovery time objective (RTO) for systems or technology and the people-based work recovery time (WRT). As such, the formula for maximum allowable downtime (another name for MTD) is the following:

Maximum allowable downtime = RTO + WRT
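
To make the formula concrete, here is a small worked example in Python. The durations are invented for illustration, and the function names are hypothetical; the only thing taken from the text above is the relationship RTO + WRT.

```python
from datetime import timedelta


def maximum_allowable_downtime(rto: timedelta, wrt: timedelta) -> timedelta:
    """Apply the formula from the text: downtime = RTO + WRT."""
    return rto + wrt


def fits_within_mtd(rto: timedelta, wrt: timedelta, mtd: timedelta) -> bool:
    """Check whether planned system and work recovery fit inside the
    downtime the business says it can tolerate."""
    return maximum_allowable_downtime(rto, wrt) <= mtd


if __name__ == "__main__":
    # Illustrative numbers only: 4 hours to restore the systems, then
    # 2 hours for staff to verify data and catch up on backlog.
    rto = timedelta(hours=4)
    wrt = timedelta(hours=2)
    print(maximum_allowable_downtime(rto, wrt))              # 6:00:00
    print(fits_within_mtd(rto, wrt, timedelta(hours=8)))     # True
    print(fits_within_mtd(rto, wrt, timedelta(hours=5)))     # False
```

If the check fails, either the technology recovery or the manual work recovery has to get faster, or the business has to accept a longer outage for that function.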

Orchestrate restore technologies: Based on the MTD and business critical functions, choose the appropriate restore technologies and orchestrate a flexible recovery strategy. 

 

Orchestration

NetBackup Resiliency Platform enables heterogeneous resiliency across on-premises and cloud platforms. Resiliency Platform supports automated replication, storage-based replication, or NetBackup’s built-in data mover and lets you choose the RTO and RPO that meet your application’s business requirements.


Veritas NetBackup Resiliency’s Virtual Business Services allow users to manage recovery for multi-tiered applications as a single consolidated entity. With Virtual Business Services, users can completely automate the recovery of a complex, multitier application that spans multiple systems. In the event of a ransomware attack, this provides easier, faster recovery and minimal application downtime. Specifically, the Veritas Resiliency Platform provides tiered recovery orchestration by, for example, configuring virtualization and private clouds (e.g., adding VMware vCenter), NetBackup Primary Servers, networks (e.g., Network Pairing), physical servers, databases, etc. See Figure 9 for the completed resiliency group protection configuration.

Automation

The NetBackup recovery platform solution supports automation through resiliency and evacuation plans (runbooks), which let you automate recovery at scale between data centers or to cloud infrastructure. The solution also allows push-button rehearsals in isolated networks to validate recovery. In ransomware recovery scenarios, organizations can use custom scripts to integrate third-party virus scanning solutions into the workflow and check for malware before returning workloads to production.
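
The following Python sketch illustrates the kind of custom script mentioned above: a gate that runs an external scanner against recovered data and only promotes it to production when the scan comes back clean. It does not use any NetBackup or Resiliency Platform API. The scanner command, the convention that exit code 0 means "clean", and the promote callback are all assumptions made for illustration; check the documentation of your own scanner for its real command line and exit codes.

```python
import subprocess
import sys

# Stand-in "scanners" so the sketch runs anywhere. Replace with the
# command line of your real scanner (assumption: exit code 0 = clean).
CLEAN_SCANNER = [sys.executable, "-c", "import sys; sys.exit(0)"]
INFECTED_SCANNER = [sys.executable, "-c", "import sys; sys.exit(1)"]


def scan_is_clean(scanner_cmd, target_path, timeout_s=3600):
    """Run the scanner against target_path; True if it exits with 0."""
    result = subprocess.run(
        [*scanner_cmd, target_path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.returncode == 0


def promote_if_clean(scanner_cmd, target_path, promote):
    """Call promote(target_path) only when the scan reports no malware.

    `promote` is a hypothetical callback, for example a function that
    reconnects the recovered workload to the production network.
    """
    if scan_is_clean(scanner_cmd, target_path):
        promote(target_path)
        return True
    return False


if __name__ == "__main__":
    promoted = []
    print(promote_if_clean(CLEAN_SCANNER, "/recovered/vm01", promoted.append))
    print(promote_if_clean(INFECTED_SCANNER, "/recovered/vm02", promoted.append))
    print(promoted)
```

Keeping the gate as a separate step makes it easy to rehearse in an isolated network first, since nothing reaches production unless the scan passes.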

 

Rehearse a Plan

A detailed resiliency plan and regular rehearsals are two key elements of ransomware resiliency. During a stressful ransomware attack, the key stakeholders should know what to do and when to do it. A tested plan reduces the potential damage and helps restore operations quickly. Rehearsal is essential for confirming that data recovery succeeds and for measuring recovery times, so test all backup and recovery plans periodically.
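
One simple way to use rehearsal results is to compare the recovery times you measured against the RTO agreed for each system. The sketch below assumes you record those timings yourself; the system names, targets, and status labels are hypothetical.

```python
from datetime import timedelta


def rehearsal_report(rto_targets, measured_times):
    """Compare measured recovery times with RTO targets.

    Returns a dict mapping each system to "met", "missed",
    or "not rehearsed".
    """
    report = {}
    for system, rto in rto_targets.items():
        actual = measured_times.get(system)
        if actual is None:
            report[system] = "not rehearsed"
        elif actual <= rto:
            report[system] = "met"
        else:
            report[system] = "missed"
    return report


if __name__ == "__main__":
    # Illustrative targets and timings from a hypothetical rehearsal.
    targets = {
        "payments-api": timedelta(hours=2),
        "database": timedelta(hours=1),
        "intranet": timedelta(hours=8),
    }
    measured = {
        "payments-api": timedelta(hours=1, minutes=40),
        "database": timedelta(hours=1, minutes=30),
    }
    for system, status in rehearsal_report(targets, measured).items():
        print(f"{system}: {status}")
```

Systems reported as "missed" or "not rehearsed" are the ones to focus on before the next rehearsal.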

Recovery At Scale 

 

The key to a successful recovery strategy is flexibility to orchestrate the appropriate solutions to ensure operational and business resiliency for rapid recovery. Sometimes everything is impacted, and you may need to recover an entire data center in the cloud and on demand. On the other hand, maybe just a portion of your environment is impacted, so having solutions in place that let you grab individual databases and files to recover quickly to production can be crucial. In the case where entire servers become encrypted, you may need to quickly recover those servers elsewhere. Or maybe you just need to recover many VMs back to production. Veritas provides a wide range of technologies and capabilities to orchestrate, automate and rehearse a robust recovery strategy. 

 

Veritas offers a wide range of capabilities to help recover at scale, including:

  • NetBackup Resiliency: NetBackup Resiliency provides automated orchestration across an organization’s entire heterogeneous environment with a consistent user experience and visibility into the best recovery options based on the options available.
  • NetBackup Instant Rollback for VMware: Provides high-speed VM recovery by using Reverse Change Block Tracking to identify which unique blocks need to be recovered and applying just those changes to bring VMs back to a healthy state in seconds.
  • NetBackup Resiliency Continuous Data Protection (CDP): Provides advanced resiliency for your applications by offering checkpoints derived from real-time replication of your production data that can be used for recovery. CDP augments primary data replication by providing granular recovery for your VMs with a near-zero RPO. This ensures recovery capability for your applications across heterogeneous environments using granular recovery points in addition to near-real-time data replication. CDP therefore adds a layer of recoverability from malware or data corruption with a much lower RPO than recovering from a backup copy. NetBackup Resiliency also provides a single interface to manage recovery from both CDP checkpoints and backups, which simplifies the recovery process.
  • VM Recovery: Provides eight types of recovery for one backup of VMware VMs, including full VM, individual VMDK, file and folder, full application, Instant Access, file download, application GRT, and AMI conversion.
  • Instant Access for MSSQL and VMware: Provides almost instant recovery of machines at scale (e.g., 1,600 VMs) without waiting to transfer each VM’s data from the backup. Also provides the ability to test or recover VMs directly from backup storage.
  • NetBackup CloudPoint: NetBackup CloudPoint uses cloud-native snapshot technology in a cloud vendor-agnostic way that allows easy protection of hybrid and multi-cloud infrastructures.
  • Universal Share and Protection Points: Allows organizations to provision deduplication-backed storage on the NetBackup server as secure shares, thereby protecting databases or other workloads where no agent or backup API exists.
  • NetBackup Universal Shares for Oracle: Allows Oracle database admins to start up databases directly from a NetBackup Appliance’s storage.
  • Long-term Retention Archive: Provides a cost-effective and durable solution that features deduplication and compression of data, including the use of object storage and private or public clouds with this method.
  • Traditional Recovery: Includes granular restore of a specific file, full server/application restore, and disaster recovery (DR) restore to a different site location or the cloud. Using Veritas Resiliency Platform, organizations can automate and orchestrate traditional recovery with the push of a button, streamlining the DR process.
  • Bare Metal Restore: Automates the server recovery process, making it unnecessary to reinstall operating systems.
  • Veritas Alta Recovery Vault: Integrates seamlessly with NetBackup and provides secure cloud Storage-as-a-Service, managed by Veritas and optimized for data protection. Recovery Vault provides:
      • Ransomware protection – secure air-gapped storage reduces the threat of data loss from ransomware attacks
      • Recover anywhere – flexibility to choose whether to recover workloads in your data center or in the cloud
      • Optimization – storage management is provided by Veritas, which gives you low-cost long-term data retention in the cloud that can complement tape as an additional option for air-gapped storage

Recovery Vault is an easy and cost-effective way to incorporate cloud storage tailored for data protection into an overall ransomware resiliency strategy. It helps reduce the risk of data loss and scales to accommodate growing data footprints.

Data Security with NetBackup Malware Scanning and Anomaly Detection

NetBackup Malware Scanning provides greater control over detection and recovery workflows. NetBackup offers two malware scanning methods to protect the integrity of your data and backup images: on-demand scans and scans triggered automatically by high anomaly scores. The last-known-good image is clearly visible in the recovery workflow, and selecting an impacted image presents several warnings to the user. If an infected image is found on immutable storage, it cannot be expired before the minimum retention period, but administrators will know about the infection and can plan accordingly. You can also scan an image before recovery, and NetBackup warns about any detections before the restore. Malware detection offers a powerful point of insight into backup images, whether in response to an alert or through an on-demand scan.
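
The idea of a last-known-good image can be illustrated with a short Python sketch. This is not how NetBackup stores or exposes scan results; the `BackupImage` record and the scan result labels are hypothetical and only show the selection logic: pick the most recent image whose scan came back clean, and never treat an unscanned image as safe.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class BackupImage:
    """Hypothetical record of a backup image and its scan status."""
    image_id: str
    taken_at: datetime
    scan_result: str  # "clean", "infected", or "not_scanned" (assumed labels)


def last_known_good(images: Iterable[BackupImage]) -> Optional[BackupImage]:
    """Return the newest image scanned as clean, or None if there is none."""
    clean = [img for img in images if img.scan_result == "clean"]
    return max(clean, key=lambda img: img.taken_at, default=None)


if __name__ == "__main__":
    images = [
        BackupImage("img-001", datetime(2023, 5, 28, 1, 0), "clean"),
        BackupImage("img-002", datetime(2023, 5, 29, 1, 0), "clean"),
        BackupImage("img-003", datetime(2023, 5, 30, 1, 0), "infected"),
        BackupImage("img-004", datetime(2023, 5, 31, 1, 0), "not_scanned"),
    ]
    best = last_known_good(images)
    print(best.image_id if best else "no clean image available")
```

In this sample, the two newest images are skipped: one is infected and the other has not been scanned yet, so it would need a scan before it could be considered for recovery.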

In the next post, I will discuss the Day 1 actions.