Move data intelligently to AWS with Information Studio

More companies are turning to the public cloud to retain their primary and secondary data, safeguarding it from on-premises failures and site-wide disasters. Deciding which files can be sent to the cloud requires insight into the information they contain.

Veritas Information Studio provides visibility into complex data landscapes that include multi-cloud architectures and diverse storage infrastructure. For organizations looking to move data to AWS, Information Studio can provide valuable insights, helping users identify data that can be safely deleted, data that should remain on-premises, and data that can be moved to the cloud. It can also help identify which data should be sent to which AWS storage class, each with different costs, availability and performance.

It works by gathering information about data across a company’s on-premises and cloud-based data infrastructure, from both primary data sources and NetBackup, and displaying it in intuitive, dashboard-based categories.


It then allows users to filter that information by specific combinations of criteria. For example, users can filter on data age, like “older than 2 years,” combined with data containing Personally Identifiable Information (PII) like phone numbers, credit card numbers and social security numbers. Users can even filter for data containing a specific phrase, like “Highly Confidential,” or a specific text string like a person’s name. Once important information is identified, reports can be extracted to support evidence-driven data decisions.
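To make the idea of combined filters concrete, here is a minimal Python sketch. It does not use Information Studio or its API; the `FileRecord` type, the two regular expressions and the function names are all hypothetical, and the patterns are far simpler than real PII detection.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record type; the product's real data model is not shown in this post.
@dataclass
class FileRecord:
    path: str
    last_modified: datetime
    text: str

# Deliberately simple illustrative patterns, not the product's classification patterns.
PII_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def older_than(record, years, now):
    """True if the record was last modified more than `years` years before `now`."""
    return now - record.last_modified > timedelta(days=365 * years)

def pii_kinds(record):
    """Names of the illustrative PII patterns found in the record's text."""
    return sorted(name for name, pat in PII_PATTERNS.items() if pat.search(record.text))

def filter_records(records, years, now, phrase=None):
    """Keep records older than `years` that contain PII and, optionally, a phrase."""
    result = []
    for record in records:
        if not older_than(record, years, now):
            continue
        if not pii_kinds(record):
            continue
        if phrase is not None and phrase.lower() not in record.text.lower():
            continue
        result.append(record)
    return result
```

Calling `filter_records(records, 2, datetime.now(), phrase="Highly Confidential")` would return only the records that satisfy all three conditions at once, which is the kind of combination described above.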

For example, when evaluating your data assets, you might want to find areas of waste that can be deleted to reduce cost and risk. Information Studio can identify stale and non-business data from more than 20 on-premises and cloud-based data sources, including Amazon S3, and can then defensibly delete this data. If you instead choose to keep stale data that you know hasn’t been accessed in several years, you can move it to low-cost storage like Amazon Glacier.
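As a rough illustration of how last-access information could drive a storage decision, the sketch below maps idle time to a storage class. The class names are S3 storage class identifiers, but the thresholds and the function name are assumptions made for this example, not Information Studio behavior or AWS guidance.

```python
from datetime import datetime

# Thresholds are illustrative assumptions; the names are S3 storage class identifiers.
TIERS = [
    (30, "STANDARD"),        # accessed within the last 30 days
    (365, "STANDARD_IA"),    # accessed within the last year
    (3 * 365, "GLACIER"),    # accessed within the last three years
]
COLDEST = "DEEP_ARCHIVE"     # anything idle for longer than the last threshold

def suggest_storage_class(last_accessed, now):
    """Suggest a storage class from the number of days since last access."""
    idle_days = (now - last_accessed).days
    for max_idle_days, storage_class in TIERS:
        if idle_days <= max_idle_days:
            return storage_class
    return COLDEST
```

In practice the right thresholds depend on retrieval needs and pricing, so treat the table as a placeholder to be tuned.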

You can also use Information Studio to determine which of your organization’s critical digital assets contain highly sensitive, personal, regulated or confidential data. Information Studio comes with more than 700 preconfigured data classification patterns and over 120 classification policies that can identify these details within data. If you need to identify other specific pattern matches, you can create custom policies. Custom policies can contain any number of conditions that look for text (multiple words, phrases or regular expressions), language, patterns (e.g., IPv4 addresses), keywords and lexicons. Conditions can use the NEAR and BEFORE operators to express proximity, and can use wildcards. When classifying data, Information Studio reads text from the files and looks for matches to the conditions and patterns within the enabled classification policies; optionally, it also identifies named entities. Each match results in additional metadata about the file being stored in Information Studio.
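To show what pattern and proximity conditions can look like, here is a small Python sketch. The exact semantics of NEAR and BEFORE in Information Studio are not described in this post, so the word-distance interpretation below is an assumption, and `near`, `before` and `IPV4` are hypothetical names.

```python
import re

# A common IPv4 pattern: four octets in the range 0-255 separated by dots.
IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)

def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def positions(words, term):
    """Indexes at which `term` occurs in the token list."""
    return [i for i, word in enumerate(words) if word == term.lower()]

def near(text, first, second, max_distance):
    """Assumed NEAR: both terms occur within `max_distance` words, in either order."""
    words = tokens(text)
    return any(
        abs(i - j) <= max_distance
        for i in positions(words, first)
        for j in positions(words, second)
    )

def before(text, first, second, max_distance):
    """Assumed BEFORE: `first` occurs at most `max_distance` words ahead of `second`."""
    words = tokens(text)
    return any(
        0 < j - i <= max_distance
        for i in positions(words, first)
        for j in positions(words, second)
    )
```

For instance, `near("The patient record lists a diagnosis", "patient", "diagnosis", 4)` is true under this interpretation because the two words are four positions apart.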

[Image: the US Social Security and Taxpayer ID classification policy]

As an example, let’s say we have enabled the US Social Security and Taxpayer ID policy shown above. During the classification operation on fileA, Information Studio identifies a US social security number and a date of birth. As a result, Information Studio adds a US-SSN tag to the metadata it stores for fileA.
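The tagging step can be sketched in a few lines of Python. This is only an illustration of the idea: the policy table, the `CREDIT-CARD` tag, the regular expressions and the metadata layout are invented for the example, and only the `US-SSN` tag name comes from the scenario above.

```python
import re

# Hypothetical policy table; real policies are far richer than a single regex.
POLICIES = {
    "US-SSN": {"enabled": True, "pattern": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")},
    "CREDIT-CARD": {"enabled": False, "pattern": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")},
}

def classify(text, policies=POLICIES):
    """Return the tags of all enabled policies whose pattern matches the text."""
    return sorted(
        tag
        for tag, policy in policies.items()
        if policy["enabled"] and policy["pattern"].search(text)
    )

# Metadata store keyed by file name (layout is an assumption).
metadata = {"fileA": {"tags": []}}
file_a_text = "SSN: 123-45-6789, DOB: 01/02/1980, card 4111 1111 1111 1111"
metadata["fileA"]["tags"] = classify(file_a_text)
```

Because the hypothetical credit card policy is disabled, only the `US-SSN` tag is stored for fileA, mirroring the point that only enabled policies are evaluated.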

 

Once Information Studio captures this metadata, users can filter on it to determine the appropriate storage location and retention. Some examples of how you can use this information include:

  • You may filter to find all information that contains regulated data and choose to store that internally on WORM storage with a retention period of 7 years.
  • You may filter for data that is not highly sensitive or regulated and choose to store this on Amazon cloud storage.
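The two examples above amount to a simple placement rule, which might be sketched like this. The set of regulated tags and the destination labels are assumptions for illustration; only `US-SSN` and the 7-year retention figure come from this post.

```python
# Hypothetical set of tags treated as regulated; only "US-SSN" appears in this post.
REGULATED_TAGS = {"US-SSN", "CREDIT-CARD", "HEALTH-RECORD"}

def choose_placement(tags):
    """Return a (location, retention_years) pair for a file based on its tags."""
    if REGULATED_TAGS & set(tags):
        # Regulated data stays internal on WORM storage for 7 years.
        return ("on-premises WORM storage", 7)
    # Everything else is a candidate for cloud storage; no retention is assumed.
    return ("Amazon cloud storage", None)
```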

Information Studio can also help you comply with data locality regulations. You can filter for sensitive or personal information about individuals in a specific country and determine where that data is located. If something is non-compliant, you can see the file owner and address the situation. If you are planning to move the data, you can ensure it is stored in a compliant region, for example in an Amazon S3 bucket in a specific region.
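A locality check of this kind could be sketched as follows. The country-to-region mapping is invented for the example and does not reflect any actual regulation; the field names are hypothetical, and the region codes are ordinary AWS region identifiers.

```python
# Hypothetical residency rules: which AWS regions are acceptable per country.
ALLOWED_REGIONS = {
    "DE": {"eu-central-1"},
    "IE": {"eu-west-1", "eu-central-1"},
    "US": {"us-east-1", "us-west-2"},
}

def locality_violations(files):
    """Return files stored in a region that is not allowed for their subjects' country."""
    violations = []
    for f in files:
        allowed = ALLOWED_REGIONS.get(f["subject_country"])
        if allowed is not None and f["region"] not in allowed:
            violations.append(
                {"path": f["path"], "owner": f["owner"], "move_to": sorted(allowed)}
            )
    return violations
```

Each violation carries the file owner, so someone can follow up, along with the regions that would be acceptable targets if the data is moved.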

In summary, Information Studio can provide invaluable information to organizations using AWS cloud storage services, including the ability to:

  • Determine what data they can safely move to AWS cloud storage services with minimal risk and liability.
  • Make decisions about proper storage of primary and secondary data.
  • Reduce cost by identifying stale data within AWS that can be deleted.
  • Adhere to data privacy regulations about data locality.

If you are using NetBackup, see the AWS Cloud Storage with Veritas NetBackup white paper for details on using Information Studio to create NetBackup Service Level Protection Plans that move data to the appropriate Amazon S3 bucket.

For additional insight into Information Studio, see my Information Studio Technical Whitepaper.