Establishing data architecture components before embarking upon a data project “is a crucial step in understanding how the data will be used and how it will bring value to the business,” said Pey Silvester, vice president of engineering at Heavy.ai. “For resilient data architecture, it’s important for organizations to reference the big data architecture framework as a blueprint for data infrastructures and solutions. This framework defines how big data solutions should work, the components that must be used, how information will flow, and critical security details.”
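To make the blueprint idea concrete, here is one hypothetical way to express components, information flow, and a basic security detail as checkable configuration. The component names and structure below are illustrative only, not drawn from Heavy.ai or any specific framework:

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    role: str                       # e.g., "ingestion", "storage", "analytics"
    encryption_at_rest: bool = True # one of the "critical security details"

@dataclass
class Blueprint:
    components: list[Component] = field(default_factory=list)
    flows: list[tuple[str, str]] = field(default_factory=list)  # (source, destination)

    def validate(self) -> None:
        # Fail fast if the declared information flow references a
        # component that was never defined in the blueprint.
        names = {c.name for c in self.components}
        for src, dst in self.flows:
            if src not in names or dst not in names:
                raise ValueError(f"flow references unknown component: {src} -> {dst}")

blueprint = Blueprint(
    components=[
        Component("ingest", "ingestion"),
        Component("lake", "storage"),
        Component("dashboards", "analytics"),
    ],
    flows=[("ingest", "lake"), ("lake", "dashboards")],
)
blueprint.validate()
```

The point of treating the blueprint as code is that how information flows and which components exist becomes something a team can review and verify before the infrastructure is built.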
Multi-cloud approaches also should be incorporated into resilient data architecture planning. “With data now living across multiple clouds, a resilient architecture must have the capabilities to help you protect data no matter where it lives,” noted Preston. “This includes a multi-cloud control plane built off three principles: no infrastructure, global policies, and self-service with central oversight. With self-service, you can delegate responsibility to the data and application owners while retaining centralized control.”
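As a loose illustration of those principles (all class and parameter names below are hypothetical, not taken from Preston or any product), a centrally defined global policy can set floors that self-service owners are free to exceed but cannot undercut:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GlobalPolicy:
    min_retention_days: int = 30   # floor set by the central team
    immutable_copies: int = 1      # applied everywhere, regardless of owner

class BackupPolicy:
    def __init__(self, global_policy: GlobalPolicy, retention_days: int):
        # Self-service: the data or application owner picks retention...
        # Central oversight: ...but never below the global floor.
        if retention_days < global_policy.min_retention_days:
            raise ValueError(
                f"retention {retention_days}d is below the global floor "
                f"of {global_policy.min_retention_days}d"
            )
        self.retention_days = retention_days
        self.immutable_copies = global_policy.immutable_copies

central = GlobalPolicy(min_retention_days=30)
app_team_policy = BackupPolicy(central, retention_days=90)   # allowed
# BackupPolicy(central, retention_days=7)                    # rejected centrally
```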
Storage is also an important part of the equation that often gets overlooked, said Rivero. “Underlying storage components must prioritize the quality of both the data itself as well as the accompanying metadata while having the ability to seamlessly scale or replace storage options,” he said. “These attributes are characteristic of cloud-based technologies that take most of the computing, processing, transfer, and storage activities off-site to secure, managed environments.”
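One way to picture the storage attributes Rivero describes is an interface that carries metadata with every write and lets the backend be scaled or swapped without touching callers. The sketch below is a hypothetical illustration, not any specific cloud API:

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    @abstractmethod
    def put(self, key: str, data: bytes, metadata: dict[str, str]) -> None: ...
    @abstractmethod
    def get(self, key: str) -> tuple[bytes, dict[str, str]]: ...

class InMemoryStore(ObjectStore):
    """Stand-in backend; a cloud-backed implementation would share the interface."""
    def __init__(self) -> None:
        self._objects: dict[str, tuple[bytes, dict[str, str]]] = {}

    def put(self, key: str, data: bytes, metadata: dict[str, str]) -> None:
        if "source" not in metadata:          # enforce metadata quality on write
            raise ValueError("metadata must record the data's source")
        self._objects[key] = (data, metadata)

    def get(self, key: str) -> tuple[bytes, dict[str, str]]:
        return self._objects[key]

store: ObjectStore = InMemoryStore()          # swap backends at this one seam
store.put("orders/2024-01.csv", b"...", {"source": "erp", "schema_version": "3"})
```

Because callers depend only on the interface, replacing or scaling the underlying storage becomes a local change rather than an application rewrite.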
In addition, from a technical perspective, an on-premises resilient data architecture should include “physically distributed resources and load balancing at each tier, and infrastructure monitoring, app monitoring, and service monitoring—CPU load, memory usage, query execution, interaction between different services and app components, and the number of bytes,” said Apshankar. At the cloud level, a resilient data architecture should be built on “an open and seamless data architecture that includes data preparation tools, data visualization tools, and agile collaboration tools.” The ideal approach needs to be “infrastructure as code, which includes multiple codes, data centers, environments, templates, and executions.”
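As a minimal sketch of the infrastructure-monitoring tier Apshankar describes, the following samples CPU load, memory usage, and bytes on the wire using the psutil library; the alert threshold and sampling cadence are illustrative assumptions, and a real deployment would ship these samples to a monitoring service rather than print them:

```python
import time
import psutil  # third-party: pip install psutil

CPU_ALERT_THRESHOLD = 90.0  # percent; illustrative value

def sample_host_metrics() -> dict[str, float]:
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),           # CPU load
        "memory_percent": psutil.virtual_memory().percent,       # memory usage
        "bytes_sent": float(psutil.net_io_counters().bytes_sent) # bytes on the wire
    }

if __name__ == "__main__":
    for _ in range(3):
        metrics = sample_host_metrics()
        if metrics["cpu_percent"] > CPU_ALERT_THRESHOLD:
            print("ALERT: CPU load high", metrics)
        else:
            print(metrics)
        time.sleep(5)
```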
OBSTACLES
Of course, building a resilient data architecture is not an overnight process. There are obstacles that will arise—not only technologically, but organizationally as well. “Data is often created in and constrained by internal silos that prevent real-time information from being widely shared. Without access to data, analytical efforts are duplicated across teams and otherwise rich data becomes redundant,” said Mehta.
IT teams are struggling to keep pace with today’s data resilience challenges “because most of them are still running backup systems that were designed for a different era,” Preston warned. On-prem backup systems require significant enhancement and security upgrades to function in today’s environment, where ransomware attacks occur multiple times a minute, Preston said. “In addition, IT teams are significantly understaffed due to a global talent shortage. These two things together mean more things to do: enhancements to the backup system with fewer people to do them.”