While AI hype has driven change in many parts of today's world, it has demanded equally dramatic change in the realm of business. Efforts to modernize data infrastructures have never been more crucial; for AI to succeed, it must be underpinned by a robust database and infrastructure foundation that ensures exceptional data privacy, governance, and quality while delivering speed, scalability, and flexibility.
These wide-ranging requirements place a heavy burden on the IT and data management professionals tasked with selecting the right database from an ever-growing array of database types, from graph to vector, time series, relational, in-memory, and beyond.
To guide enterprises through the thicket of the latest database technologies and strategies, experts joined DBTA’s roundtable webinar, New Database Technologies and Strategies for the AI Era, offering their expertise on the market’s best solutions for supporting AI projects.
According to John de Saint Phalle, principal product manager at Precisely, data integrity is a crucial component in establishing a modern data infrastructure that properly underpins AI. This bears directly on a business's overall success, as executives continue to ask for new data points to evaluate the business, from defining metrics to delivering against KPIs.
“Data integrity is all about building trust in your data. We know that every organization, every department, every individual wants to be data-driven in this day and age,” said de Saint Phalle. “[But] they need to adopt solutions that are really going to allow this data-driven future to become a reality.”
Trusted data pipelines that deliver valid, actionable data downstream are the lifeblood of modern data infrastructure. This is, however, easier said than done; innovation comes at a cost, bringing a certain tension between sustaining existing cultures and expanding into new strategies and technologies.
Precisely, the provider of a unique combination of software, data, and strategy services for delivering trusted data, offers organizations its expertise in key platform areas, including IBM infrastructure, SAP, and other relational database technologies, so that innovation can alleviate that tension. Establishing a robust infrastructure at the core of the business supersedes any jump to a newfangled AI toy because, as de Saint Phalle put it, “regardless of what the [AI] trends are, it’s always about the data.”
Examining whether your business is ready for AI is another vital aspect of modernization, according to Francois Protopapa, principal solutions architect at Elastic. Protopapa pointed to four questions that evaluate whether an organization is truly AI-ready:
- Do you have all of your data consolidated?
- Do you have a secure data layer?
- Have you adopted semantic and hybrid search?
- Are you ready to use AI in a way that delivers business value?
Elastic, the company helping everyone find answers that matter from all data, in real time, at scale, offers the Elasticsearch technology stack, which ensures that searching your proprietary data amounts to more than a simple search box. Each layer of the stack, from integration to storage and visualization, works to enable AI projects through observability, security, and search.
Protopapa further explained that Elastic is “not just a vector database…it is a complete engine which provides you with tools to go directly into artificial intelligence applications.”
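To make the semantic and hybrid search question above more concrete, the sketch below shows one common shape of a hybrid query issued through the Elasticsearch Python client, pairing a lexical (BM25) match clause with a kNN vector clause. The index name, field names, and the stand-in query vector are illustrative assumptions, not details from the webinar.

```python
# Minimal hybrid-search sketch with the Elasticsearch Python client (8.x).
# Index name, field names, and the query vector are illustrative assumptions;
# a real application would produce the vector with an embedding model.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="products",  # hypothetical index
    # Lexical leg: classic BM25 full-text matching.
    query={"match": {"description": "waterproof hiking boots"}},
    # Semantic leg: approximate kNN over a dense_vector field.
    knn={
        "field": "description_embedding",     # hypothetical vector field
        "query_vector": [0.12, -0.53, 0.88],  # toy stand-in embedding
        "k": 10,
        "num_candidates": 100,
    },
    size=10,
)

for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("description"))
```

When both clauses are supplied, Elasticsearch combines the lexical and vector scores per document, which is one straightforward way to deliver the hybrid search Protopapa's checklist calls for.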
Michael O'Donnell, senior analyst at Quest Software, contextualized data infrastructure modernization for AI by explaining that vector databases, among other AI-centric tools, aren't new; they have simply spiked in popularity. Yet there is a key distinction to make: there are “enterprise databases” with vector capabilities—such as SingleStoreDB, PostgreSQL, and MongoDB—and vector-only databases—such as Pinecone, Weaviate, and ChromaDB.
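As a hedged illustration of the “enterprise database with vector capabilities” category, the sketch below stores and searches embeddings inside PostgreSQL via the pgvector extension, using the psycopg 3 client. The table, columns, and toy three-dimensional vectors are assumptions chosen for readability, not details from the webinar.

```python
# Sketch: vector similarity search inside PostgreSQL with the pgvector
# extension, via the psycopg 3 client. Table and column names are
# illustrative; real embeddings have hundreds or thousands of dimensions.
import psycopg

with psycopg.connect("dbname=appdb user=app") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS docs (
               id bigserial PRIMARY KEY,
               body text,
               embedding vector(3)  -- toy dimensionality
           )"""
    )
    conn.execute(
        "INSERT INTO docs (body, embedding) VALUES (%s, %s::vector)",
        ("example document", "[0.1, 0.2, 0.3]"),
    )
    # <-> is pgvector's Euclidean-distance operator; ORDER BY ... LIMIT
    # turns it into a nearest-neighbor search.
    rows = conn.execute(
        "SELECT body FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
        ("[0.1, 0.2, 0.25]",),
    ).fetchall()
    print(rows)
```

The appeal of this category is that vector search sits next to ordinary tables, joins, and transactions rather than in a separate, vector-only system.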
Making a choice among the seemingly endless database options becomes challenging, noted O'Donnell, especially in light of benchmark research by SingleStore and the Purdue Database Group. This research revealed that SingleStore outperformed native vector databases in several areas; with information like this, how can we expect developers to choose wisely?
To help with this decision, O'Donnell introduced two vector database comparison tables: a community-managed collaborative spreadsheet and an in-progress taxonomy of databases that is slated to be open source. Both can shed some necessary light on the otherwise murky, crowded world of modern databases.
Torsten Steinbach, VP chief architect, analytic/AI at EDB, emphasized the role that Postgres has to play in driving AI success with proprietary data, stating that “increasingly, you are seeing databases play a mission-critical role for AI.”
To frame this argument, Steinbach explained that human intelligence and AI are composed of the same parts. Intelligence is made up of the brain, consciousness, and memory, all interacting with one another. Similarly, AI has an equivalent “brain” (the models and LLMs), “consciousness” (the AI application itself), and “memory” (the AI's data).
Building AI solutions that effectively bring together those three components involves a myriad of tasks, including solution-specific development, prompt and context window management, model fine-tuning and serving, AI data capture, and more.
The solution to this extensive workload burden is an AI database, which takes care of more of a developer’s tasks—such as AI data prep, feature engineering, storage, retrieval, and running LLMs—while layering on top of the vector database, according to Steinbach.
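To give a rough sense of the retrieval and context-window-management work Steinbach says an AI database can absorb, the sketch below pulls the passages nearest to a question out of a Postgres table (as in the earlier pgvector example) and assembles them into an LLM prompt. The embed() helper and the docs table are hypothetical stand-ins, not EDB's actual API.

```python
# Rough retrieval-augmented-generation sketch: fetch the passages nearest to
# a question and assemble them into an LLM prompt. embed() and the docs
# table are hypothetical stand-ins.
import psycopg

def embed(text: str) -> str:
    """Hypothetical helper: return the text's embedding as a pgvector
    literal such as '[0.1, 0.2, 0.3]'. A real implementation would call
    an embedding model."""
    raise NotImplementedError

def build_prompt(question: str, conn: psycopg.Connection, k: int = 3) -> str:
    rows = conn.execute(
        "SELECT body FROM docs ORDER BY embedding <-> %s::vector LIMIT %s",
        (embed(question), k),
    ).fetchall()
    # Crude context-window management: keep only the top-k passages.
    context = "\n\n".join(body for (body,) in rows)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
```

In Steinbach's framing, an AI database pushes steps like these (embedding, retrieval, even running the LLM) down into the database itself, shrinking the application-side workload.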
EDB positions Postgres as the AI database to ease the breadth of requirements for enabling AI. For the full roundtable featuring demos, a live Q&A, and more, you can view an archived version of the webinar here.