Data Genomics Index

ChrisTalbott · 07-19-2016

The recently released Data Genomics Index gives the industry its first real answer to the question: What is my storage environment made of?

And the answer is far from comforting for information managers.

From true growth metrics to file type composition to the age distribution and size proportions of individual files, this report describes the characteristics of a typical environment. For it, Veritas analyzed tens of billions of files and their attributes, drawn directly from many of our customers’ unstructured data environments in 2015, to provide the first accurate view into what the average environment looks like.

The average environment is growing incredibly fast, its file type composition is not what you would expect, and it’s stale, noisy and cluttered by former employees. Data is growing 39% year over year; images, developer files and compressed files take up more than a third of the environment; 41% of the environment hasn’t been modified in the last 3 years; the average petabyte holds 2.3 billion files; and 5% of the environment is orphaned.

While these metrics may seem simple, they’re very powerful when facing tough information management decisions at increasing scale. One thing we hear all the time from our customers is that they’re struggling with two competing forces: the exponential data growth curve, and the limited resources and budget available to fight it with new servers and applications.

But they face countless decisions and paths when trying to change that crippling growth dynamic. They have to act intelligently and devote remediation effort where it yields the best return.

As we’ve taken the Data Genomics Index on the road around the globe, one thing is clear: organizations are tired of just buying infrastructure to solve this problem. Many want to become more data-centric in their data centers, deciding what information to put where based on its actual value to the company. And they’re looking to the larger industry for best practices when doing so.

We believe the Data Genomics Project is a chance to do this, to change the way we think about managing information at its core. The idea of a consistent data profile, an environment’s “genetic” make-up, interpreted for the many information management decisions it could inform, calls for an interdisciplinary discussion. We can contribute heavily to that discussion, but it needs many other voices.

Veritas is founding the initiative and gathering a community of like-minded data scientists, industry experts and thought leaders driven to surface the true nature of enterprise environments, build the data genome that matters for information management, and share the discussion with a world struggling to solve tremendous data growth challenges.

If you want to join the cause, direct message us @datagenomicsproj or email chris.talbott@veritas.com.
