A Map from Reactive to Proactive Data Discovery

For the last ten years, we've all attended conferences and events with presentations about how proper preservation and better eDiscovery data collection can help optimize discovery. However, once you consider data growth, the variety of data types (social media, email, etc.), growing regulatory requirements (data privacy), and data location challenges (multicloud strategies), it's clear that the complexity of discovery is not shrinking, even though we have better eDiscovery tools than ever at our disposal.

We believe it is time to discuss a proactive data discovery approach with your legal IT team, one that supports targeted eDiscovery processes and modern tools. Think of it like keeping a 'clean' and 'organized' garage: one in which you can find things and throw away unnecessary items from time to time. Your approach to digital collections should be no different. You can be more proactive with your data management and reap the rewards when you need them most.

Once upon a time, litigation, investigations, regulatory requests, and other legal events were handled as one-off fire drills by in-house and retained counsel. Modern corporate legal departments want preservation and discovery requests handled with standardized, efficient business processes. The key to rolling back your 'data fog of war' is a proactive data map that visualizes data sources, data types, business units, and custodial data ownership. For too long, corporate legal has relied on custodians and IT administrators to scope holds and collections without having visibility into the context and content of targeted sources. Retained counsel have resisted selective collections and insisted on full custodial data dumps because they lacked confidence in their understanding of, and access to, diverse enterprise data sources. Sprawling file shares, SharePoint/Team sites, and other unstructured repositories pose unique challenges to assigning ownership and business context for discovery scoping.

How reactive discovery put counsel behind the curve:

Traditional reactive discovery scenarios involved interviewing key custodians or IT admins to manually build a list of potentially relevant data sources. Counsel had no direct access to data sources or insight into their context, volume, or content. This blind dependence raised the risk of either missing key evidence or massively over-collecting raw custodial data. Collecting, processing, hosting, and reviewing these reactive custodial data dumps can cost >$1,500 per GB[1] when only a small fraction will be produced.

eDiscovery Costs Are Significant - Source: eDiscovery Pricing Survey 2019 (
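
To make the cost difference concrete, here is a rough back-of-the-envelope sketch. It treats the >$1,500 per GB figure cited above as a flat all-in rate, and the custodian count, per-custodian volume, and targeted fraction are purely hypothetical numbers chosen for illustration, not survey data.

```python
# Assumption: the ">$1,500 per GB" figure is treated as a flat all-in rate.
# Real pricing varies by stage, vendor, and matter.
COST_PER_GB_USD = 1_500


def collection_cost(volume_gb, cost_per_gb=COST_PER_GB_USD):
    """Estimated cost to collect, process, host, and review a data set."""
    return volume_gb * cost_per_gb


# Hypothetical matter: 10 custodians with 50 GB each in a full custodial dump,
# versus a targeted collection that pulls one fifth of that volume.
custodians = 10
full_dump_gb = custodians * 50
targeted_gb = full_dump_gb / 5

full_cost = collection_cost(full_dump_gb)
targeted_cost = collection_cost(targeted_gb)
savings = full_cost - targeted_cost

print(f"Full custodial dump: ${full_cost:,.0f}")
print(f"Targeted collection: ${targeted_cost:,.0f}")
print(f"Difference:          ${savings:,.0f}")
```

Changing the hypothetical inputs shows how quickly over-collection compounds as custodian counts and volumes grow.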

A targeted, proactive, and mature discovery workflow gives counsel visibility into, and collection access across, the enterprise without support tickets and third-hand accounts. Being proactive requires investment in solutions that identify, analyze, and report on data so that your data map stays up to date and supports an effective targeted collection strategy. This enables counsel to interview or survey potential custodians armed with reports on their data sources, types, volumes, and more. Instead of chasing after forgotten file shares or collecting local files already replicated to the custodian's OneDrive, modern counsel have defensible checklists and sampling collections to make Rule 26 production certifications with confidence.

I thought I already had a legal data map:

Many corporations have conducted point-in-time data mapping exercises that captured a snapshot of systems for divestitures, FTC Second Requests, interrogatories, and audits. The names, types, locations, and other broad system parameters were entered into spreadsheets, databases, or, more recently, SharePoint lists. These prototype data maps were out of date before they were completed and rarely included user ownership or access data. These early static data maps still provided value when most corporate data resided within corporate data centers or on corporate-issued hardware. The diversified and dynamic nature of modern business data sources requires solutions that actively monitor local, enterprise, and cloud data sources.

Building a Veritas dynamic data map:

Legal, Compliance, and Security users need to understand the relationships between data, users, and business roles. This contextual information lives across siloed systems including:

  • Active Directory/LDAP
  • HR system
  • Corporate organization chart
  • IT asset tracking systems
  • Mobile Device Management systems
  • Endpoint or DLP systems
  • Records management systems
  • Content Management systems – SharePoint, Documentum
  • Communication systems – Email, Chat
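
As a minimal sketch of how input from siloed systems like those listed above could be merged into one custodian-centric map, consider the following. The record shapes, field names, and sample feeds are all hypothetical; this does not describe how any Veritas product stores or builds its data map.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataSource:
    name: str       # e.g. a mailbox, file share, or SharePoint site
    system: str     # which siloed system reported this source
    data_type: str  # e.g. "email", "files", "mobile"


@dataclass
class CustodianEntry:
    custodian: str
    business_unit: str = "unknown"
    sources: list[DataSource] = field(default_factory=list)


def build_data_map(directory, source_feeds):
    """Merge directory records and per-system feeds into a map keyed by custodian."""
    data_map = {
        user: CustodianEntry(custodian=user, business_unit=unit)
        for user, unit in directory.items()
    }
    for system, rows in source_feeds.items():
        for user, name, data_type in rows:
            # Users missing from the directory still get an entry, flagged "unknown".
            entry = data_map.setdefault(user, CustodianEntry(custodian=user))
            entry.sources.append(DataSource(name, system, data_type))
    return data_map


# Hypothetical sample inputs standing in for directory/HR and system exports.
directory = {"asmith": "Finance", "bjones": "Engineering"}
source_feeds = {
    "email": [("asmith", "asmith mailbox", "email")],
    "sharepoint": [
        ("asmith", "Finance team site", "files"),
        ("bjones", "Design docs site", "files"),
    ],
    "mdm": [("cdoe", "Corporate phone", "mobile")],  # user not in the directory
}

data_map = build_data_map(directory, source_feeds)
for entry in data_map.values():
    names = ", ".join(source.name for source in entry.sources)
    print(f"{entry.custodian} ({entry.business_unit}): {names}")
```

Keying the map by custodian mirrors how holds and collections are usually scoped, and the "unknown" business unit surfaces accounts that appear in one system but not in the directory.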

The input from these systems creates a high-level list of structured data sources associated with a given custodian. But what about the vast sea of unstructured file shares, SharePoint sites, and legacy repositories? Identifying potentially relevant unstructured data requires analysis of ownership, access, and user actions to build custodian context. Veritas Data Insight's file analysis illuminates dark data repositories. It assigns inferred ownership based on your parameters and understands user business role context.
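
To illustrate what inferring ownership from user activity can look like, here is a simple heuristic sketch. It is not Data Insight's algorithm; the event format, the action weights, and the sample paths are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical weights: a write is treated as stronger evidence of ownership
# than a read. Actions not listed here contribute nothing.
ACTION_WEIGHTS = {"write": 3, "read": 1}


def infer_owners(events, weights=ACTION_WEIGHTS):
    """Return {path: user}, picking the user with the highest weighted activity.

    Ties go to whichever user appeared first in the event stream.
    """
    scores = {}
    for path, user, action in events:
        scores.setdefault(path, Counter())[user] += weights.get(action, 0)
    return {path: counts.most_common(1)[0][0] for path, counts in scores.items()}


# Hypothetical access events: (path, user, action).
events = [
    ("/shares/finance/q3.xlsx", "asmith", "write"),
    ("/shares/finance/q3.xlsx", "bjones", "read"),
    ("/shares/finance/q3.xlsx", "bjones", "read"),
    ("/shares/eng/spec.docx", "bjones", "write"),
    ("/shares/eng/spec.docx", "asmith", "read"),
]

owners = infer_owners(events)
for path, owner in owners.items():
    print(f"{path} -> {owner}")
```

Adjusting the weights is one way to express "your parameters": an organization that cares more about who reads a file than who edits it would simply weight reads higher.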

Veritas Information Classifier can leverage 700+ pre-configured classification patterns, 120+ common policies, automated OCR, and machine learning to optimize your discovery process. Imagine excluding privileged documents from collections or segregating confidential/trade secret documents for special handling. Minimize the time and risk when scoping legal holds or discovery collections with rich content reports on your custodial data sources.
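
The idea of excluding privileged documents or segregating trade secret documents can be sketched with a toy pattern-based classifier. The two regular expressions below are hypothetical stand-ins, not the Information Classifier's patterns or policies, and real privilege screening needs far more than keyword matching.

```python
import re

# Hypothetical patterns for illustration only.
PATTERNS = {
    "privileged": re.compile(
        r"attorney[- ]client|privileged (and|&) confidential", re.IGNORECASE
    ),
    "trade_secret": re.compile(r"trade secret|proprietary formula", re.IGNORECASE),
}


def classify(text):
    """Return the sorted list of tags whose pattern matches the text."""
    return sorted(tag for tag, pattern in PATTERNS.items() if pattern.search(text))


def scope_collection(documents):
    """Split document names into (collect, excluded, special_handling) lists."""
    collect, excluded, special = [], [], []
    for name, text in documents.items():
        tags = classify(text)
        if "privileged" in tags:
            excluded.append(name)   # held back from the collection
        elif "trade_secret" in tags:
            special.append(name)    # collected, but routed for special handling
        else:
            collect.append(name)
    return collect, excluded, special


# Hypothetical sample documents.
documents = {
    "memo.txt": "Privileged & Confidential: attorney-client communication.",
    "recipe.txt": "This trade secret describes the proprietary formula.",
    "lunch.txt": "Team lunch is on Friday.",
}

collect, excluded, special = scope_collection(documents)
print("Collect:", collect)
print("Excluded:", excluded)
print("Special handling:", special)
```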

Veritas eDiscovery Platform is corporate Legal's self-service portal to manage legal holds, collections, and review. Data Insight and the Veritas Information Classifier support Legal's dynamic data map and optimized selective collections.

  [1] Source: eDiscovery Pricing Survey 2019 (