Don’t Let Data Gravity Weigh Down Your Multicloud Initiatives

Large enterprises moving data to the cloud inevitably run into issues related to the concept of “data gravity.” IT expert Dave McCrory coined the term to describe how data, as it accumulates, builds a gravitational mass that attracts other services and applications, making the data harder to move. The longer you run applications and services in an environment, the more data they accumulate and the “heavier” they become.

Accumulating vast amounts of data has opened up exciting opportunities. However, you have to consider the quality of the data you keep. Good data allows you to optimize your business – from better segmenting customers to customizing products more effectively. But redundant, obsolete, trivial, or even malicious data can exacerbate existing issues and limit success as you move to the cloud.

As you start to spread data, applications, and infrastructure across a hybrid or multicloud environment, various issues associated with data gravity can arise. Migrating or modernizing applications becomes challenging, and applications that sit far from their data are susceptible to latency issues. Moving data to, from, and between clouds can be costly because some cloud service providers (CSPs) charge egress fees. And top of mind for most is the threat of a ransomware attack. If ransomware were to hit your data center or your cloud provider, it could impact a critically important data warehouse or data lake, resulting in financial and reputational consequences.
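To make the egress-fee point concrete, here is a minimal back-of-the-envelope sketch. The function name, the per-GB rate, and the free allowance are all hypothetical placeholders, not any provider's actual pricing; check your own CSP's rate card before relying on a number like this.

```python
def estimate_egress_cost(data_gb: float, rate_per_gb: float,
                         free_tier_gb: float = 0.0) -> float:
    """Estimate the one-time fee for moving data out of a cloud.

    All inputs are assumptions supplied by the caller: the volume to
    move, a flat per-GB egress rate, and any free monthly allowance.
    Real pricing is often tiered, so treat the result as a rough guide.
    """
    if data_gb < 0 or rate_per_gb < 0 or free_tier_gb < 0:
        raise ValueError("inputs must be non-negative")
    # Only the volume above the free allowance is billed.
    billable_gb = max(data_gb - free_tier_gb, 0.0)
    return billable_gb * rate_per_gb


# Hypothetical scenario: 50 TB (50,000 GB) at $0.09 per GB, first 100 GB free.
cost = estimate_egress_cost(50_000, 0.09, free_tier_gb=100)
print(f"Estimated egress cost: ${cost:,.2f}")
```

Even with a simple flat rate, the estimate shows how the cost of a move grows with the size of the data set, which is one way data gravity shows up on an invoice.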

Remember that your cloud data is your responsibility. If a worst-case scenario happens, you have little recourse against your cloud service provider under your end-user license agreement.

Let’s look at some potential impacts that data gravity can have on an organization:

  • Data gravity causes data protection challenges: A healthcare organization uses both on-prem and cloud storage for applications containing sensitive medical records and patient data. Because the data grows every second and is dispersed across storage locations, it’s difficult to keep the most recent patient records backed up. If a ransomware attack succeeded, this critical data could be lost, impacting patient care.
  • Data gravity results in data being poorly organized: A global automotive company collects large amounts of customer records and stores that data in the cloud. Overall data visibility becomes an issue, and locating specific data becomes almost impossible. This lack of visibility can lead to compliance issues, causing both financial and reputational repercussions.
  • Data gravity causes an organization to lose confidential data: A state government law enforcement agency is storing terabytes of body cam video across multiple public cloud environments. The data is siloed across these disparate cloud providers, and the agency’s data protection, management, and governance are poor. If a ransomware attack were to hit, access to that video data could be lost, potentially hindering an investigation or court case.

Data gravity not only makes data harder to move; it can also make data less available, more complex to manage, and, most importantly for the scenarios above, less visible when it is mission-critical. If you can mitigate the effects of data gravity, you open up far more possibilities for data and analytics to drive business value and competitive advantage.

Countering data gravity

The best way to negate the effects of data gravity and drive your hybrid multicloud strategies forward is to confront the underlying issues head-on. Fortunately, there are several best practices to follow.

  • Gain a holistic view of your environment regardless of where data resides. Data gravity can cause volumes of data to build up in one part of an organization, which doesn’t share the data with other departments. Mission-critical data is moved to the public cloud or spread across multiple clouds and on-prem. This makes your company’s application and data storage structure highly complex and siloed. It’s difficult to understand how services, applications, and data stores interact. By gaining global, end-to-end visibility into your entire environment across geographies, clouds, and data centers, you will be better prepared and protected.
  • Assess where the areas of data gravity are strongest. This allows an organization to determine where it may run into problems with data migration or overall visibility. By determining how certain applications depend on certain data sources and how data has accumulated over time, organizations can also predict the effects of gravity in the future.
  • Classify all of your data to determine its business value. Automated data classification and file analysis will save you significant time and labor, and reduce errors. Automated classification protocols can help quickly tag and manage the most sensitive data first. You may be able to delete significant amounts of data from your on-premises and cloud storage resources, which not only reduces the gravitational effects but also helps protect your organization against some of the scenarios discussed earlier.
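The classification step above can be sketched as a simple rule-based pass over file metadata. This is only an illustration under stated assumptions: the `FileRecord` fields, the labels, and the thresholds (three years without access, under 1 KB) are hypothetical, and commercial classification tools use far richer signals, including content inspection for sensitive data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical metadata for one stored file."""
    path: str
    size_bytes: int
    last_accessed: datetime
    content_hash: str  # e.g. a checksum computed elsewhere


def classify_rot(files, now, obsolete_after_days=1095, trivial_below_bytes=1024):
    """Label each file as redundant, obsolete, trivial, or keep.

    The thresholds are illustrative defaults, not recommendations.
    The first file seen with a given hash is treated as the original;
    later files with the same hash are labeled redundant.
    """
    seen_hashes = set()
    cutoff = now - timedelta(days=obsolete_after_days)
    labels = {}
    for f in files:
        if f.content_hash in seen_hashes:
            labels[f.path] = "redundant"
        elif f.last_accessed < cutoff:
            labels[f.path] = "obsolete"
        elif f.size_bytes < trivial_below_bytes:
            labels[f.path] = "trivial"
        else:
            labels[f.path] = "keep"
        seen_hashes.add(f.content_hash)
    return labels


# Hypothetical inventory spanning on-prem and cloud locations.
now = datetime(2024, 1, 1)
inventory = [
    FileRecord("onprem/reports/q3.pdf", 250_000, datetime(2023, 11, 5), "a1"),
    FileRecord("cloud/backup/q3_copy.pdf", 250_000, datetime(2023, 11, 6), "a1"),
    FileRecord("cloud/archive/2018_log.txt", 90_000, datetime(2018, 6, 1), "b2"),
    FileRecord("onprem/tmp/scratch.tmp", 12, datetime(2023, 12, 20), "c3"),
]
for path, label in classify_rot(inventory, now).items():
    print(f"{label:10s} {path}")
```

Files labeled anything other than "keep" are candidates for review and possible deletion, not automatic removal; retention and compliance rules should decide what actually gets deleted.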

Data gravity may be the biggest hurdle enterprises face as they pursue a hybrid or multicloud strategy, and overcoming it requires data-driven, fact-based decision making. The good news is that once organizations understand holistically how data storage resources and applications interact and depend on each other, they’ll be well on their way to achieving seamless operations and data visibility across physical, virtual, and cloud environments.
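One rough way to start building that understanding is to rank data stores by how much data they hold and how many applications depend on them. The scoring rule below (size multiplied by the number of dependent applications) is a simplified heuristic invented for this sketch, not Dave McCrory's formulation, and every store and application name is hypothetical.

```python
def rank_by_gravity(stores, dependencies):
    """Rank data stores by a simple, assumed gravity score.

    stores:       {store_name: size_in_tb}
    dependencies: iterable of (application_name, store_name) pairs
    Score = size in TB * number of distinct applications that depend
    on the store. Returns (store_name, score) pairs, highest first.
    """
    dependents = {name: set() for name in stores}
    for app, store in dependencies:
        if store not in dependents:
            raise KeyError(f"unknown data store: {store}")
        dependents[store].add(app)
    scores = {name: size * len(dependents[name]) for name, size in stores.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical environment: three data stores and the apps that read them.
stores = {"warehouse": 40.0, "data_lake": 100.0, "cold_archive": 500.0}
dependencies = [
    ("billing", "warehouse"),
    ("crm", "warehouse"),
    ("analytics", "warehouse"),
    ("analytics", "data_lake"),
]
for name, score in rank_by_gravity(stores, dependencies):
    print(f"{name:14s} gravity score: {score:.1f}")
```

In this example the largest store ranks last because nothing depends on it, while the smaller warehouse ranks first because three applications do. That matches the idea that gravity comes from the pull data exerts on applications, not from volume alone.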

Read more of our series of blogs on taking responsibility for your data.

Part 1: Cloud EULAs are Complex – Why You Need to Read the Fine Print

Part 2: The Top Dangers Lurking in Your CSP EULA

Part 3: Your Data is Still Your Responsibility – Even in the Cloud