What it Takes to Leverage AI: A Look at Modern Data Architecture Needs


The enduring pursuit of AI technology has illuminated many areas of the enterprise that need serious work before any AI implementation takes place. Data architectures, for instance, require a thorough reexamination to determine how these infrastructures can be extended to support the intense demands, and possibilities, of AI.

In DBTA’s webinar, Modern Data Architecture for AI, John O'Brien, principal advisor and industry analyst at Radiant Advisors, led experts through a thorough discussion of how to cultivate the right data infrastructure to support AI initiatives, from leading best practices to key technologies.

Behavioral data is as uniquely valuable as it is uniquely reflective of each person interacting with the enterprise, according to Yali Sasson, co-founder and chief product officer at Snowplow. Because it is both explanatory and predictive in nature, behavioral data has the potential to fuel analytics, machine learning (ML), and GenAI use cases that transform business success.

“Suddenly, as data people, we could collect very granular data that describes exactly how each of these individuals were doing whatever they were doing and we could really start to understand those journeys,” said Sasson. “We could collect that data at scale across potentially millions or billions of people, depending on the scale of the digital service.” 

While behavioral data has the capacity to catapult many businesses into the modernity they aspire to, most organizations lack the infrastructure needed to harness it. According to a report from Gartner, some of the top data barriers to AI adoption are data accessibility challenges, data scope or quality problems, and data volume and complexity.

Snowplow works to solve these data problems so that developers can leverage behavioral data to build AI and GenAI applications, explained Sasson. With a platform that delivers high-quality, trustworthy behavioral data in real time, developers can power AI features directly while driving systematic customer intelligence. Snowplow’s behavioral data, which is delivered with its semantic metadata underpinned by a knowledge graph, is easily understood by AI and routinely checked for data quality issues both before and during production.

Don Doerner, technical director at Quantum Corporation, centered the discussion on how Quantum not only helps customers store, curate, and manage their data but also, perhaps most importantly, helps them use that data to produce tangible value. By broadening its CatDV Asset Management Platform to drive data classification and enrichment, Quantum aims to assist enterprises in cleaning and using their data so that it can be best leveraged by AI and ML technologies.

Doerner emphasized that while AI is often seen as a complex endeavor, “it’s just another workflow,” where data travels through different stages to power AI use cases. In the first phases of this journey—data identification and data preparation—classification, enrichment, down-selection, cleaning, pre-processing, and feature engineering are necessary to feed the AI model the best data.

“If you teach an artificial intelligence model using flawed data, you will get flawed results,” said Doerner.
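To make the "just another workflow" idea concrete, the preparation stages can be sketched as a chain of small functions that each take records in and hand records on. This is only an illustrative sketch under stated assumptions: the record fields (`name`, `size_mb`), the stage rules, and every function name are hypothetical and do not represent how CatDV or any other product actually works.

```python
# Hypothetical data-preparation workflow: each stage is a plain function
# that takes a list of records and returns a new list of records.

def classify(records):
    # Classification/enrichment: tag each record with a coarse kind (assumed rule).
    return [{**r, "kind": "video" if r["name"].endswith(".mp4") else "other"}
            for r in records]

def clean(records):
    # Cleaning: drop records whose size is missing or not positive.
    return [r for r in records if r.get("size_mb") and r["size_mb"] > 0]

def down_select(records):
    # Down-selection: keep only the records relevant to this use case.
    return [r for r in records if r["kind"] == "video"]

def add_features(records):
    # Feature engineering: derive a simple model input from raw fields.
    return [{**r, "is_large": r["size_mb"] > 100} for r in records]

def run_pipeline(records, stages):
    # Pass the data through each stage in order.
    for stage in stages:
        records = stage(records)
    return records

raw = [
    {"name": "interview.mp4", "size_mb": 250},
    {"name": "notes.txt", "size_mb": 1},
    {"name": "broken.mp4", "size_mb": None},  # flawed record, removed by clean()
]
prepared = run_pipeline(raw, [classify, clean, down_select, add_features])
```

Keeping each stage separate means a flawed record is filtered out at a known point before it can reach model training.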

Ultimately, once you reach model deployment, the data needs are dependent on the AI model being used. “The data was used to train the model; the model is now used to interpret the event stream,” noted Doerner. For example, using retrieval-augmented generation (RAG) to improve the accuracy of a large language model (LLM) requires implementing vector databases and vector embeddings, which in turn dictates an enterprise’s specific data needs.
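The retrieval step behind RAG can be illustrated with a toy example: documents are turned into vectors, stored, and the closest match to a question is pulled back as context for the LLM prompt. This is a minimal sketch with assumptions labeled: `embed` is a bag-of-words stand-in for a real embedding model, `ToyVectorStore` is a hypothetical in-memory substitute for a vector database, and the sample documents are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding (assumption): word counts instead of a learned embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    # Hypothetical in-memory stand-in for a vector database.
    def __init__(self):
        self.items = []  # list of (vector, document) pairs

    def add(self, doc):
        self.items.append((embed(doc), doc))

    def search(self, query, k=1):
        # Rank stored documents by similarity to the query vector.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]

def build_prompt(store, question, k=1):
    # Retrieval-augmented prompt: retrieved text is prepended as context.
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

store = ToyVectorStore()
store.add("refunds are processed within five business days")
store.add("the warehouse ships orders every weekday morning")
prompt = build_prompt(store, "how long do refunds take")
```

In a production setting the toy embedding and in-memory list would be replaced by a real embedding model and a vector database, which is where the additional data infrastructure requirements come from.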

Vendors need to live in the customer’s world, Doerner emphasized, understanding the flow and innate intricacies of data to effectively solve their problems. Often, the resulting architecture will be a multi-vendor ecosystem that achieves the desired AI use cases within the context of an organization’s infrastructure.

For the full, in-depth discussion featuring development ecosystem breakdowns, a Q&A, and more, you can view an archived version of the webinar here.

