Hurdling into data modernity—which forces enterprises to reckon with distributed applications, diverse data sources, data bottlenecks, increasing costs, data latency, vendor lock-in, and more—presents as many challenges as it does opportunities. Within this balance, ensuring that your workers have access to data exactly when it's needed is a vital piece of the modern puzzle.
David Loshin, principal consultant at Knowledge Integrity, and James Goodfellow, director of product marketing at Progress, joined DBTA’s webinar, 6 Challenges of Managing Information in a Modernized Hybrid Data Architecture, to explore several major roadblocks that enterprises should consider when adapting to a hybrid data landscape and architecture.
Loshin’s approach to this topic—which is sourced from his paper on the same subject—boils down to “the need to really understand what modernization means from a practical perspective and how to differentiate that from…what might be deemed as a migration effort.” Focusing more on business drivers, rather than technological implementation, is a key component of his perspective on modern data environments.
Loshin identified the following as six major challenges in managing information in a modernized hybrid data architecture:
- Data landscape complexity
- Data quality requirements
- Real-time integration
- Data fusion
- Unified data governance
- Legally mandated compliance
Beginning with the first challenge, Loshin explained that data landscape complexity ultimately stems from the increased distribution of data across a variety of file structures, platforms, systems, and providers. This complicates users’ ability to access that data and limits the delivery of harmonized views.
The remedy, Loshin posited, lies in a managed connectivity solution that eliminates the data engineer’s need to be familiar with the many potential interfaces for data access. He further emphasized that having systems in place that ensure both data accessibility and confidence in the data they surface is crucial.
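To make that idea concrete, here is a minimal Python sketch of the kind of abstraction a managed connectivity layer provides: engineers issue the same query through a single interface regardless of where the data lives. SQLAlchemy is used purely for illustration; the connection URLs, source names, and table are hypothetical, not details from the webinar.

```python
# Minimal sketch: one query interface over heterogeneous sources, so data
# engineers don't have to code against each backend's native driver or API.
# Connection URLs, source names, and the table below are hypothetical.
from sqlalchemy import create_engine, text

SOURCES = {
    "orders_cloud":  "postgresql+psycopg2://user:pass@cloud-host/sales",
    "orders_onprem": "mssql+pyodbc://user:pass@onprem-host/legacy"
                     "?driver=ODBC+Driver+17+for+SQL+Server",
}

def fetch(source_name: str, query: str):
    """Run the same SQL text against any registered source."""
    engine = create_engine(SOURCES[source_name])
    with engine.connect() as conn:
        return conn.execute(text(query)).fetchall()

# The caller only needs a logical source name, not backend-specific knowledge.
rows = fetch("orders_cloud", "SELECT id, total FROM orders LIMIT 10")
```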
Data quality requirements—or the “ever-present and perennial struggle to achieve data quality,” as Goodfellow put it—pose intricacies similar to those of data landscape complexity. Accessing data becomes even more difficult when that data derives from multiple sources, each created under different circumstances and managed within distinct process flows.
“What’s fascinating to me is that the problems that organizations face are still not addressed—we’ve been talking about this for decades and yet people are still struggling with data quality,” pointed out Loshin.
The reason? A hyperfocus on the technologist perspective of data quality—those creating the tooling—and a lack of attention to the data user perspective—those who ascertain whether the data is usable based on what they need to do, according to Loshin. Each data source adds another order of magnitude of complexity, and if the quality of that data is poor, it can be detrimental to the entire data estate.
To bring more focus to what data users need in terms of data quality, Loshin recommended leveraging managed connectivity, data catalogs, and other tools that allow engineers to partner with data consumers to solicit their data quality and usability requirements, as well as to define data validation policies that can be deployed within data pipelines.
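One way to picture such a policy is the minimal Python sketch below, assuming a simple batch pipeline step that enforces consumer-defined validation rules before data moves downstream. The field names, rules, and quarantine behavior are illustrative assumptions rather than anything prescribed in the webinar.

```python
# Minimal sketch: consumer-defined validation rules checked inside a pipeline
# step. Field names, rules, and quarantine handling are hypothetical examples.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]

# Rules solicited from data consumers, expressed as simple predicates.
RULES = [
    Rule("customer_id present", lambda rec: bool(rec.get("customer_id"))),
    Rule("amount non-negative", lambda rec: rec.get("amount", 0) >= 0),
]

def validate(record: dict) -> list[str]:
    """Return the names of any rules the record violates."""
    return [r.name for r in RULES if not r.check(record)]

def pipeline_step(records: list[dict]):
    """Route each record to the main flow or to a quarantine list."""
    good, quarantined = [], []
    for rec in records:
        failures = validate(rec)
        if failures:
            quarantined.append({"record": rec, "violations": failures})
        else:
            good.append(rec)
    return good, quarantined

good, quarantined = pipeline_step([
    {"customer_id": "C-1", "amount": 42.0},
    {"customer_id": "", "amount": -5.0},
])
```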
The challenge of real-time integration, according to Loshin, stems from the combination of data distribution and the use of different tiers of cloud-based storage infrastructure, which introduce increased latency into data requests. He further offered that pipeline management and orchestration, when used in conjunction with virtualization to identify sources of data delays, can allow for caching and streamlined access to data, optimized to minimize latency.
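A minimal sketch of that caching idea, assuming a simple time-to-live policy placed in front of a slow remote storage tier, is shown below in Python. The fetch_from_source helper and the five-minute TTL are hypothetical placeholders, not details from the discussion.

```python
# Minimal sketch: a TTL cache in front of a high-latency source so repeated
# requests skip the slow round trip. fetch_from_source and the TTL are
# hypothetical stand-ins for illustration only.
import time

CACHE: dict[str, tuple[float, object]] = {}
TTL_SECONDS = 300  # assumption: five-minute freshness is acceptable

def fetch_from_source(key: str):
    """Hypothetical stand-in for a slow call to a remote storage tier."""
    time.sleep(0.5)  # simulate network/storage latency
    return {"key": key, "fetched_at": time.time()}

def get(key: str):
    hit = CACHE.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]  # served from cache, no remote round trip
    value = fetch_from_source(key)
    CACHE[key] = (time.time(), value)
    return value
```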
For the full discussion of the six major challenges in managing information in a modernized hybrid data architecture, you can view an archived version of the webinar here.