A major risk for organizations that use big data analysis is a legal request from opposing parties or regulators (e.g., for discovery or a legal investigation) for big datasets or their underlying raw data. It can be very difficult to limit the scope of such a request to the legal issues at hand. As a result, the organization can end up turning over far more data than is necessary or appropriate, because of technical limitations in segmenting or identifying the relevant data subsets. These challenges are still new, so there are no known industry best practices and no legal authority yet exists. This is not good news for organizations that currently use big data analysis and may also be implicated in lawsuits or other legal matters. Even so, there are ways to mitigate exposure and protect the organization as well as possible, even though this remains largely uncharted territory from a legal compliance perspective.
Information security risks are also important to consider within the larger legal and risk context. If they are not mitigated early, they alone can open the door to broader discovery of big datasets and systems. Information security in a broad sense can include:
- Data Integrity and Privacy
- Encryption
- Access Control
- Chain-of-Custody
- Relevant Laws/Regulations
- Corporate Policies
Specific examples of situations where information security policies should be monitored include:
- Vendor Agreements
- Data Ownership & Custody Requirements
- International Regulations
- Confidentiality Terms
- Data Retention/Archiving
- Geographical Issues
Contracts with third-party big data providers warrant special attention, because this is an area where legal or risk problems may arise. Strict controls over third parties are important. More and more big data systems and technologies are supplied by third parties, so the organization must have restrictions and protections in place to ensure that side-door and backdoor discovery do not occur.
When data is under third-party control, avoiding common pitfalls leads to better control of data risk and cost. Common problems include:
- Inadvertent data spoliation, which can include stripping metadata and truncating communication threads
- Custody and control of the data, including access rights and issues with data removal
- Problems with relevant policies/procedures, which can include a lack of planning and a lack of enforcement of rules
- International rules and regulations, including cross-border issues
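One way to make the first two pitfalls above easier to detect is to log a fingerprint of each record before it leaves the organization's custody. The following is only a minimal sketch in Python, not an established practice from the article: the names `CustodyEntry`, `make_entry`, and `verify`, as well as the sample record and its metadata fields, are hypothetical and chosen purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEntry:
    """Hypothetical chain-of-custody log entry for one record."""
    record_id: str
    custodian: str   # party holding the record, e.g. a third-party vendor
    sha256: str      # fingerprint of the content at hand-off
    metadata: dict   # metadata fields present at hand-off
    logged_at: str   # UTC timestamp of the log entry


def make_entry(record_id: str, custodian: str,
               content: bytes, metadata: dict) -> CustodyEntry:
    """Fingerprint a record and its metadata before hand-off."""
    return CustodyEntry(
        record_id=record_id,
        custodian=custodian,
        sha256=hashlib.sha256(content).hexdigest(),
        metadata=dict(metadata),
        logged_at=datetime.now(timezone.utc).isoformat(),
    )


def verify(entry: CustodyEntry, content: bytes, metadata: dict) -> list:
    """Return a list of problems found when the record comes back."""
    problems = []
    # Any change to the content (e.g. a truncated thread) changes the hash.
    if hashlib.sha256(content).hexdigest() != entry.sha256:
        problems.append("content changed")
    # Metadata keys that were logged at hand-off but are now absent.
    missing = sorted(set(entry.metadata) - set(metadata))
    if missing:
        problems.append("metadata stripped: " + ", ".join(missing))
    return problems


if __name__ == "__main__":
    original = b"Re: Q3 forecast\n> earlier message\nreply"
    meta = {"sender": "a@example.com", "sent": "2014-01-05"}
    entry = make_entry("msg-1", "vendor-a", original, meta)
    # Simulate a truncated thread returned with one metadata field removed.
    print(verify(entry, original[:15], {"sender": "a@example.com"}))
```

A log like this does not prevent spoliation by itself, but it gives the organization a record of what was handed over and in what state, which supports the custody and control concerns listed above.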
For more articles related to big data, download DBTA's Big Data Sourcebook.