New Data Horizons: Data Prep, Data Visualization, and Data Catalogs Are Ready for Prime Time

Page 3 of 4

CHALLENGES AHEAD

As with any business decision, it’s important to look before you leap. The central challenge of data democratization is managing the sheer volume of data being created. The new generation of tools and platforms may solve the big data problem, but they also “lower the barrier of entry for advanced analytics for more organizations, and empower more consumers to use data,” said Sumit Sarkar, senior director of product marketing at Immuta. “This results in a new bottleneck from increasing users and rules for data use. While our tools are now better, smarter, and faster, the challenge is that there is so much data. And the complexity of storage and retrieval only increases as more users are added.”

Even when organizations can handle large volumes of data, “some functions haven’t changed much in 4 decades,” said Tricot. “Ultimately, we need more advances in data integration, which takes up a big portion of operating costs.” Collaboration on data projects is still an area “which is clearly vital but still primitive. We make copies and more copies of numerous documents and datasets, leading to versioning nightmares and potential non-compliance or security vulnerabilities with each new copy. It’s even stranger that it’s almost impossible to separate data from the applications used to generate, collate, or analyze it. Meanwhile, most applications have their own language and data model—a constant obstacle to developers of new applications that need to use that data.”

In addition, many of today’s data integration and enablement tools “still lack essential enterprise capabilities such as a data platform built for AI and, more importantly, data governance,” said Shah. “Effective data governance is a complex problem, and most large enterprises choose a multi-vendor approach for their cloud strategy. This creates an ever-growing list of data, reports, models, and other analytic assets. The topmost priorities of enterprise data stewards are the discovery of these data assets, providing context to the exploding data footprint, cataloging, data lineage, and data protection.”
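The asset discovery and lineage tracking that Shah describes can be pictured with a minimal sketch. This is purely illustrative—the `Catalog` and `CatalogEntry` names are hypothetical, not any vendor’s API—but it shows the core idea: every data asset is registered with its upstream sources, so lineage is just a walk over those references.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """One asset (table, report, model) registered in the catalog."""
    name: str
    owner: str
    upstream: list = field(default_factory=list)  # names of source assets

class Catalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry):
        self._entries[entry.name] = entry

    def lineage(self, name: str) -> list:
        """Walk upstream references to list every ancestor of an asset."""
        seen, stack = [], list(self._entries[name].upstream)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self._entries[current].upstream)
        return seen
```

Registering a raw table, a cleaned table derived from it, and a report derived from the cleaned table lets `lineage("report")` trace the report back through both ancestors—context a data steward needs before trusting or protecting any downstream asset.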

Data enablement tools and solutions “have yet to exploit the full power of AI and data science disciplines, tools, and technologies,” said Rajagopalan. “They need to be augmented with automation frameworks and data governance standards. The successful implementation of data enablement tools requires a cultural shift and widespread data literacy across the organization, which goes well beyond the scope of such tools and platforms. We foresee a convergence in the data enablement space toward simplifying tools and technologies that are tightly integrated with data governance and standards, cloud platforms, enterprise automation tools, and open source AI, machine learning, and deep learning frameworks.”

In the months and years ahead, there will be “a growing need for data ethics and privacy in the absence of a universal framework,” Shah observed. “As enterprises collect more and more data, new data platforms must provide a way to identify personal and confidential data as an augmented offering.” Look for platforms “that can bring order to the growing data chaos and support evaluating the fitness of data in an efficient, governed, and secure manner,” he advised.
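The kind of personal-data identification Shah calls for often starts with pattern-based scanning. The sketch below is a deliberately simplified illustration—real scanners combine many detectors, validation logic, and ML classifiers, and the two patterns here are assumptions for the example, not a complete rule set.

```python
import re

# Illustrative patterns only; a production scanner would use many more
# detectors plus validation (checksums, context, ML classification).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(record: dict) -> dict:
    """Return {field: [pii_types]} for fields whose values match a pattern."""
    hits = {}
    for field_name, value in record.items():
        types = [name for name, pattern in PII_PATTERNS.items()
                 if isinstance(value, str) and pattern.search(value)]
        if types:
            hits[field_name] = types
    return hits
```

Running such a scan as data lands in a platform is what lets personal and confidential fields be flagged—and then governed—before they spread into downstream reports and models.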

There is an increasing need for global cataloging and governance across varied cloud and on-premises data environments. “The ability to identify, locate, and trace datasets is still a key capability requirement, and this is exacerbated in a hybrid multi-cloud world,” said Anu Mohan, director of product, data integration and management for Vantage Cloud. “Access seems to have improved, in that most tools expect to just read a file or use very basic SQL to import data directly into the data consumption tools. But the non-standard elements of access make it harder. If you have data on-prem and in AWS, you have difficulties. If you switch from AWS to Azure, everything changes. The portability of data, the portability of applications, and the ability to manage across these environments are all challenges to be solved in the next 5 years.”

What’s also missing from most data enablement today “is the capability for organizations to secure the use of data in real time,” said Beecham. “Data enablement is insufficient when its focus is only on the movement of data, and not on how it’s accessed, protected, and transformed within use. Even with a data enablement plan, users can find themselves trapped in separate silos for security, governance, DataOps, analytics, and more—without an approach to data enablement that prioritizes integrated control and security. What’s needed is functionality that secures access to data, especially PII [personally identifiable information], not only when it’s at rest but also when it’s in transit, updated, or cleaned.”
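One common way to secure data “within use,” as Beecham describes, is to pseudonymize sensitive columns at access time rather than only encrypting them at rest. The sketch below is a minimal illustration under stated assumptions—the column classification and the `mask_row` helper are hypothetical, and real systems use governed key management rather than a bare hash.

```python
import hashlib

# Assumed classification, e.g. supplied by a data catalog's PII tags.
SENSITIVE_COLUMNS = {"ssn", "email"}

def mask_row(row: dict, allowed: set) -> dict:
    """Pseudonymize sensitive columns unless the caller is entitled to them."""
    out = {}
    for column, value in row.items():
        if column in SENSITIVE_COLUMNS and column not in allowed:
            # Deterministic token: the same value always masks to the same
            # token, so joins and group-bys still work on masked data.
            out[column] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            out[column] = value
    return out
```

Because the masking happens on read, an analyst without entitlement to `ssn` can still join and aggregate on the tokenized column, while an entitled user sees the clear value—one concrete form of the integrated access control Beecham argues is missing from movement-only data enablement.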

Beecham predicts, however, that making data enablement work as part of a holistic approach to data security and governance is on the horizon and will be available to the data industry within the next couple of years. “Data visibility, control, and security are not just nice-to-haves, or requirements for compliance. They’re the foundation for understanding how data is used across the organization and for making it possible to get the full benefit of business data. When the ability to control and secure data is available across a complicated IT environment, it makes understanding and maximizing the use of all data easier and more valuable for businesses.”

