The time of mythical, inexplicable AI that operates seemingly “like magic” is over. Imbuing an AI system with ethics and transparency is a fundamental component of any AI strategy in 2024.
At the annual Data Summit conference, Bharath Vasudevan, head of product, erwin, Quest, led the session “Making AI Ethical & Explainable,” which considered how consistent, clean, and curated datasets are key to bridging the gap between data and the real world, enabling AI to transcend “magic” and ground itself in transparent truth.
The annual Data Summit conference returned to Boston, May 8-9, 2024, with pre-conference workshops on May 7.
Vasudevan focused on examining the current state of AI and how organizations can mature their data to drive value.
“The state of AI today—it just has to get better,” noted Vasudevan. “Data is ultimately at the heart of this.”
They explained that the AI problem is beginning to intersect with data governance, where data curation and data trust are fundamental components of AI success. This is especially relevant considering that most AI initiatives fail, often for the following reasons:
- AI is not app development
- ROI misalignment
- There’s too much data
- Can’t trust our data
- No proof of concept, just proof of confusion
To bridge the gap between data and the “real” world and properly fuel AI projects, Vasudevan implored attendees to cultivate the “4C’s” for datasets:
- Consistent: Trusted, value-scored, ready, and available
- Context: Context-based indexing and searching
- Curated: Business definitions, rules, processes, and context
- Clean: Watermarked, trusted, and free from drift
Notably, enterprises should strive to solve business problems directly and ground AI initiatives in that reality, not try to reinvent the wheel. “Don’t let perfection get in the way of better,” Vasudevan warned. “Projects that shoot for the moon fail because they don’t make anything better.”
Vasudevan pointed to Quest’s model-to-marketplace approach, where, whether or not enterprises finish the climb, the goal is to enact incremental change, recognizing where you are in the process and where you want to be. This approach includes the following steps:
- Model: Design data architecture.
- Catalog: Search and find data easily.
- Curate: Enrich data with business context.
- Govern: Apply business rules and policies.
- Observe: Raise data visibility and integrate data quality.
- Score: Automate data value scoring.
- Shop: Make trusted, governed data widely accessible.
Safe and responsible adoption of AI is a crucial theme that all companies will have to contend with. AI governance is coming into focus for regulators, who are catching up with the pace of innovation that AI has displayed.
The best way to prepare for AI involves people and process implementation, not just technology. Vasudevan suggested creating a Center of Excellence (CoE) to manage the embedding of AI into the business. While it may slow innovation, it ultimately protects the business, explained Vasudevan.
The adoption of a data products approach also enables AI to be supported by trusted, contextual data. Pointing to Gartner’s recommendation for AI governance, they explained that “as analytics artifacts are delivered to consuming users, more must be taken into consideration than just the object itself. Organizations taking a product approach to analytics delivery will find increased trust across domains and reduced redundancies in development.”
Vasudevan explained that a data product is made of three components: the data modeler (access), the data catalog (ownership), and the data marketplace (value). Data products should also maintain these three attributes:
- Data products should be available, discoverable, and reusable.
- Data products should deliver business value to drive data valuation.
- Data products should assign responsibility to the people who maintain, monetize, and nurture the data product throughout its life.
Data products further underscore the value of data marketplaces, which act as a one-stop shop for them. A marketplace empowers organizations to shop, share, and compare, comprehensively accessing and assessing data products from a centralized location. It also aids in managing third-party data, affording enterprises greater transparency into costs, entitlements, and more, shedding light on an otherwise murky area of data management.
With an easy-to-use marketplace, enterprises can combine the power of internal, external, and synthetic datasets with best-in-class lineage and data value scoring to democratize trusted data throughout the business.
Many Data Summit 2024 presentations are available for review at https://www.dbta.com/DataSummit/2024/Presentations.aspx.