Data Summit 2025 is a unique conference that brings together IT practitioners and business stakeholders from all types of organizations. Featuring workshops, panel discussions, and provocative talks, the conference gives attendees a comprehensive educational experience designed to guide them through all of today’s key issues in data management and analysis. Whether you're fascinated by the technical potential and complexities of emerging technologies or focused on leveraging Big Data for business intelligence, analytics, and strategic decision-making, we've got you covered!
Access to all tracks including AI & Machine Learning Summit plus Generative AI Boot Camp and Data Engineer Boot Camp is included when you register for an All-Access Pass or Full Two-Day Conference Pass. Attendees may switch between tracks as they choose. Only interested in the 2-day AI & Machine Learning Summit or our 1-day Boot Camps? Standalone registration for this content is also available.
Tuesday, May 13: 9:00 a.m. - 12:00 p.m.
Every organization faces unique challenges in becoming data-driven. This practical half-day session guides attendees through creating a modern data architecture that aligns with their business strategy to deliver ongoing and scalable value. Through our proven four-step methodology, attendees learn to translate business goals into architectural decisions and evaluate emerging technologies, including cloud-native platforms, data lakehouses, and data fabrics, to build a prioritized road map. Data leaders gain frameworks to assess which modern data stack components best serve their organization's specific needs and capabilities. Most importantly, attendees leave with actionable insights that will motivate them to transform their data infrastructure and drive business outcomes.
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors
Tuesday, May 13: 9:00 a.m. - 12:00 p.m.
Semantic layers stand out as a key approach to solving business problems for organizations grappling with the complexities of managing and understanding the meaning of their data. A semantic layer, also called a context layer, is a business representation of data that allows organizations to quickly map various data definitions from multiple data sources to familiar business terms, offering a consistent and consolidated view of data. Join our workshop to gain insights into the foundations of semantic/context layers, their implementation, and the business value they provide by enhancing the utility of your data. The workshop promises an interactive experience, offering participants the opportunity to both understand the nuances of semantic/context layers and actively engage in constructing one.
Joseph Hilger, COO, Enterprise Knowledge, LLC
Sara Nash, Principal Consultant, Enterprise Knowledge, LLC
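As a rough illustration of the mapping idea in the semantic layer abstract above, the sketch below renames columns from two source systems to shared business terms so that both yield the same consolidated view. The source names (`crm`, `billing`), column names, and business terms are hypothetical and are not taken from the workshop materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source: str         # hypothetical source system name
    column: str         # physical column name in that source
    business_term: str  # shared, business-friendly name


# Hypothetical mappings: two systems describe the same concepts differently.
MAPPINGS = [
    FieldMapping("crm", "cust_nm", "Customer Name"),
    FieldMapping("billing", "customer_full_name", "Customer Name"),
    FieldMapping("crm", "acct_open_dt", "Account Open Date"),
    FieldMapping("billing", "opened_on", "Account Open Date"),
]


def to_business_view(source: str, record: dict) -> dict:
    """Rename one source record's columns to the shared business terms."""
    lookup = {m.column: m.business_term for m in MAPPINGS if m.source == source}
    # Unmapped columns are dropped so the view exposes only agreed terms.
    return {lookup[col]: value for col, value in record.items() if col in lookup}


crm_row = {"cust_nm": "Ada Lovelace", "acct_open_dt": "2024-01-15", "internal_flag": 1}
billing_row = {"customer_full_name": "Ada Lovelace", "opened_on": "2024-01-15"}

crm_view = to_business_view("crm", crm_row)
billing_view = to_business_view("billing", billing_row)
print(crm_view == billing_view)  # both records now use the same vocabulary
```

A production semantic layer would typically also carry definitions, relationships, and access rules; this sketch covers only the term-mapping step.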
Tuesday, May 13: 1:00 p.m. - 4:00 p.m.
Data teams are deluged with user requests for datasets, metrics, or reports. They need help finding, accessing, validating, or fixing data. For many data leaders, the solution is simple: Empower users to service their own data and analytics needs. But what is easy to say is challenging to do. This workshop provides practical, time-tested approaches to democratizing data and creating an insights-driven culture. It shows how data teams can eliminate data bottlenecks by transforming themselves from order takers to strategic business partners who proactively anticipate business needs. However, the path to self-service nirvana is not for the faint-hearted: It requires developing a deep knowledge of business user needs and then overhauling the team's operating model, data architecture, data governance, data delivery, and support networks to meet those needs.
Wayne Eckerson, President, Eckerson Group
Tuesday, May 13: 1:00 p.m. - 4:00 p.m.
While fine-tuned large language models (LLMs) excel at simple coding tasks, they often struggle with complex code, specialized libraries, and detailed applications. This workshop addresses the challenge of LLM hallucinations by introducing a method of breaking down training data into smaller, descriptive components that build sequentially into complex code. Learn how this structured approach enables even smaller LLMs running on consumer-grade GPUs to perform significantly better. Jacob shares real-world experiences in fine-tuning code generators for modern AI libraries such as LangChain and Vertex AI. With enhanced dataset representation, models produce more accurate, library-aligned code. He covers practical strategies for improving LLM performance in specialized code generation tasks. Attend this workshop to learn how to improve LLM code generation through innovative dataset representation strategies.
Shomron Jacob, Head of Applied Machine Learning and Platform, Engineering, Iterate.ai
Wednesday, May 14: 8:00 a.m. - 8:45 a.m.
Wednesday, May 14: 8:45 a.m. - 9:30 a.m.
In the factory-driven Industrial Revolution, we began to view and measure work as a process. Now, in the AI Revolution, we will need to adopt a different model, where we view and measure work as a story. Building on the neuroscience that makes us wired for story patterns, storytelling uses “story” as a communication strategy, while story thinking uses “story” as an operational strategy. The volume, velocity, and variety of data will be connected to processes but also to the organization’s overall narrative intelligence. Lewis discusses the implications of data visualization through the lens of story visualization, which requires understanding human beliefs and commitments, and provides examples for leadership, change, innovation, healthcare, and organizational design.
John Lewis, CKO, Explanation Age LLC
Wednesday, May 14: 9:30 a.m. - 9:45 a.m.
Wednesday, May 14: 9:45 a.m. - 10:00 a.m.
Wednesday, May 14: 10:00 a.m. - 10:45 a.m.
Modern Data Strategy Essentials Today is your guide to the key principles data-driven companies are applying to achieve success in our increasingly complex world of data sources, types, applications, requirements, and user expectations. Attend this track to learn how to align technology, people, and processes with the complete data journey and the capabilities that support your current and future needs.
Designed for chief information officers, chief data officers, digital transformation leaders, IT business liaisons, enterprise architects, data architects, data engineers, and data management and analytics professionals.
Wednesday, May 14: 10:45 a.m. - 11:45 a.m.
Data is the cornerstone to becoming insights-driven.
Gade explores key strategies for implementing scalable data models, optimizing performance, and ensuring regulatory compliance across various domains, including finance, retail, and manufacturing. He provides insights into practical methodologies for designing efficient data lakes, leveraging cloud technologies such as AWS, and utilizing tools such as Erwin, Hive, and Informatica. The discussion also highlights case studies on successful migrations, such as transitioning from legacy systems to AWS Data Lakes, integrating real-time analytics with ML models, and the application of industry-standard data models for business intelligence.
Kishore Reddy Gade, VP, Lead Software Engineer, JPMorgan Chase
Traditional MDM (master data management) falls short in today's AI-driven world. Smith introduces Syncari's Agentic MDM, which transforms MDM into an intelligent, scalable data foundation. Learn how to enable real-time AI integrations, adaptive governance, and data trust to drive business impact.
Jack Smith, Principal Solutions Engineer, Syncari
Wednesday, May 14: 12:00 p.m. - 12:45 p.m.
To excel at maximizing your data resources, you need to have a workable strategy in place.
Cooney provides a set of real-world recommendations for the successful development of enterprise data strategy and practice in your organization. Drawing on more than 20 years of working with data technologies and more than 10 years in the cloud and AI, he shares key observations that will help you stay grounded in the flux of today's data and AI market. He reveals the five secrets in this session.
Pete Cooney, Enterprise Lead Data Architect, Jackson National Life
Wednesday, May 14: 12:45 p.m. - 2:00 p.m.
Wednesday, May 14: 2:00 p.m. - 2:45 p.m.
Having data is one thing, but having a resilient strategy to manage it in an AI world is a decisive factor for success.
In today’s fast-paced digital landscape, a modern organization’s ability to effectively manage its data is a decisive factor in its success. As organizations strive to harness the power of AI and advanced analytics, maintaining high-quality, reliable data has never been more critical. Vasudevan explains how organizations can build resilient data strategies that improve data quality, visibility, and trust while reducing development time, enhancing decision making, and fostering collaboration.
Bharath Vasudevan, VP of Product Management, Quest Software
Wednesday, May 14: 2:45 p.m. - 3:15 p.m.
Wednesday, May 14: 3:15 p.m. - 4:00 p.m.
People and the culture of organizations affect how decisions are made.
Hicks discusses how the relationship between organizational decision-making styles and data analytics maturity levels impacts organizational data culture. She highlights the significant impact of fostering an organizational data culture on productivity, profitability, and operational efficiency, emphasizing the importance of data-driven decision making, the components necessary to achieve data analytics maturity, and the breaking down of data silos. She offers practical recommendations for driving organizational change.
Suzannah Hicks, AI Program Architect & Strategist, Hummingbird Healthcare
Wednesday, May 14: 4:15 p.m. - 5:00 p.m.
Understanding the people involved in decision making around implementing AI within an organization leads to partnership building.
As AI transforms from a technological novelty to a business imperative, enterprises face a critical challenge: how to effectively introduce and scale AI initiatives across their organizations. Keane challenges the traditional siloed approach to AI adoption by advocating for a powerful trinity of leadership—where the chief technology officer (CTO), chief data & analytics officer (CDAO), and chief financial officer (CFO) form a strategic alliance to drive AI transformation.
Matt Keane, VP, Data Science, AI, Innovation, Aflac
Wednesday, May 14: 5:00 p.m. - 6:00 p.m.
What’s Next in Data and Analytics Architecture drills down on shifting trends and emerging best practices that are helping companies achieve more flexible, modular, and distributed data infrastructures to support modernization and innovation. Attend this track to gain a deeper understanding of the new technologies and strategies driving greater speed and scale, and improved governance and security, at organizations hungry for fast, actionable insights.
Designed for chief information officers, chief data officers, enterprise architects, data architects, data engineers, data scientists, and data management and analytics professionals.
Wednesday, May 14: 10:45 a.m. - 11:45 a.m.
When considering the many aspects of data architecting, a few best practices stand out.
Architectural best practices for building and scaling effective GenAI and agentic AI applications, optimized for cost and performance, lead to an unparalleled customer experience. A serverless approach combined with agentic AI orchestration allows enterprises to cost-effectively deliver personalized and dynamic online customer care applications that captivate their end users.
Vandana Saini, Senior WorldWide Generative AI Specialist Solutions Architect, AWS
Wednesday, May 14: 12:00 p.m. - 12:45 p.m.
The future of data lakehouses encompasses many elements, including catalogs.
As the industry embraces the lakehouse paradigm and the variety of table formats such as Iceberg, Hudi, and Delta, the next key challenge is understanding the role of lakehouse catalogs. These catalogs govern and track your lakehouse assets, providing essential metadata and ensuring smooth management across different computing engines. Merced demystifies leading catalog solutions such as Apache Polaris (incubating), Nessie, Unity Catalog, Gravitino, Dremio Catalog, and AWS Glue and guides you through navigating this evolving landscape to effectively manage your lakehouse.
Alex Merced, Senior Tech Evangelist, Dremio
Wednesday, May 14: 12:45 p.m. - 2:00 p.m.
Wednesday, May 14: 2:00 p.m. - 2:45 p.m.
AI technologies are having an immense impact on organizations, including cloud databases.
Learn about the latest trends in cloud database modernization, including the cloud-native and vector databases offered by AWS and Azure. Learn essential tips to help you pick the right database for your modern application and analytics needs, including use cases and comparisons of IaaS and PaaS databases, vector databases, and GenAI.
Michael Agarwal, Director & Practice Lead of Cloud Database Services, Datavail
Wednesday, May 14: 2:45 p.m. - 3:15 p.m.
Wednesday, May 14: 3:15 p.m. - 4:00 p.m.
When it comes to modern data architectures, data mesh continues to play a critical role.
Balancing centralized control with decentralized data management in a modern data mesh architecture remains a challenge. As organizations adopt decentralized data strategies, governance becomes both more essential and more complex. Lu advocates for a hybrid approach, where central platform teams take on a governance role, providing a suite of tools that empower various parts of the organization to develop and manage their own data products effectively. This setup allows for decentralized data ownership, yet ensures there’s a cohesive framework in place to maintain data quality, consistency, and trust across the board. The goal is to foster autonomy within teams while upholding a unified approach to data integrity and governance.
Hugo Lu, CEO, Orchestra
Wednesday, May 14: 4:15 p.m. - 5:00 p.m.
Data warehouses are an integral part of modern data architectures.
In discussing future-proofing data warehousing, Vayyala considers the impact of data modeling, medallion architecture, and integration. Building an enterprise data warehouse is more than just a technical challenge; it’s a strategic investment in a company’s future. Implementing a robust, scalable, and flexible EDW ensures that an organization has the data it needs to make informed, data-driven decisions.
Rajesh Vayyala, Principal Data Architect, Data Architecture and Design, PRA Group Inc.
Wednesday, May 14: 5:00 p.m. - 6:00 p.m.
By identifying patterns in vast amounts of data and creating human-like content at lightning-fast speeds, GenAI applications have emerged as a powerful tool for automating and optimizing a wide variety of tasks. Although adoption is still in the early stages, many organizations are currently testing and deploying GenAI applications in pursuit of greater efficiency and productivity. At the same time, succeeding with GenAI requires overcoming a range of challenges, from legacy infrastructure and skills shortages to governance and security risks, data quality issues, and trust and transparency concerns. Attend this boot camp to dive into the key technologies and emerging best practices.
Designed for chief information officers, chief data officers, data architects, data engineers, data scientists, and AI engineers and developers.
Wednesday, May 14: 10:45 a.m. - 11:45 a.m.
Supercharging customer experiences is one aspect of GenAI that holds real promise.
Gudla looks at two innovative approaches designed to improve grocery search results by enhancing both relevance and discoverability, with a focus on the development and application of a new product relevance classification model, alongside the strategic integration of LLMs to improve discoverability of novel products. By leveraging the precise categorization capabilities of the ESCI model and the contextual understanding provided by LLMs, Instacart could anticipate and meet consumer needs more effectively. This ultimately led to increased engagement and incremental revenue.
Vinesh Gudla, Staff Machine Learning Engineer, Instacart
Wednesday, May 14: 12:00 p.m. - 12:45 p.m.
Knowledge graphs are key to unlocking the power of retrieval-augmented generation.
AI’s "disillusionment" phase isn’t an AI problem—it’s a data problem, one that knowledge graphs can solve. They guide AI with precision and context, ensuring a clear path toward trustworthy AI. They prevent wrong turns by organizing and linking data in semantically contextual ways and ensure models don’t just process data, but do it accurately, reliably, and contextually with relevance to limit hallucinations. Pal discusses how knowledge graphs help improve data quality, mitigate AI risks, reduce costs, and prepare enterprises to be AI-ready to reap ROAI (Return on AI Investments).
Sumit Pal, Strategic Technology Director, Graphwise.ai
Wednesday, May 14: 12:45 p.m. - 2:00 p.m.
Wednesday, May 14: 2:00 p.m. - 2:45 p.m.
An important component in gaining trust in GenAI models and implementations is retrieval-augmented generation.
RAG is an important tool not only for overcoming the limitations of LLMs but also for optimizing GenAI itself. Zeiler explains the architectural and technical components required for production-grade RAG systems, addressing critical considerations such as embedding generation, vector store capabilities, and LLM selection. Learn how input/output modalities influence design decisions and how to navigate edge cases as you scale complexity. Discover practical insights and innovative applications of RAG, along with lessons learned from real-world implementations.
Matt Zeiler, Founder & CEO, Clarifai
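To make the components named in the RAG abstract above more concrete, here is a toy sketch of the retrieval half of a RAG pipeline: embedding generation, a vector store, and prompt assembly. It rests on loud assumptions: the "embedding" is a plain bag-of-words count standing in for a real embedding model, the `VectorStore` class is a hypothetical in-memory list, the documents are made up, and no LLM is called. It is not the speaker's architecture.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store; real systems use an indexed vector database."""

    def __init__(self):
        self._items = []

    def add(self, doc):
        self._items.append((doc, embed(doc)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question, store, k=1):
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("Iceberg is an open table format for data lakehouses.")
store.add("Knowledge graphs link entities with typed relationships.")

top = store.search("What is an open table format?", k=1)
print(top[0])
print(build_prompt("What is an open table format?", store))
```

In a real deployment the prompt would then be sent to whichever LLM was selected, which is the step this sketch leaves out.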
Wednesday, May 14: 2:45 p.m. - 3:15 p.m.
Wednesday, May 14: 3:15 p.m. - 4:00 p.m.
The possibilities inherent in introducing GenAI into organizations are exciting but may not address every issue.
GenAI is an exciting and useful technology that is adding value to many enterprise applications. Compelling as it is, GenAI is not always the correct solution for analyzing unstructured data. Sometimes other forms of AI and ML are better-suited to the job. For example, GenAI is great for summarizing the findings of a collection of research documents, but non-generative AI can surface and recommend other documents related to topics of interest. Seuss describes and demonstrates how AI in all its various forms can be combined to analyze unstructured data.
David Seuss, CEO, Northern Light
Wednesday, May 14: 4:15 p.m. - 5:00 p.m.
It's tempting to think that GenAI will sell itself, but making the business case for it is still required.
In the modern business landscape, AI and data strategies can no longer operate in isolation. To drive meaningful outcomes, organizations must align these critical components within a unified framework tied to overarching business objectives. Crolene explores the necessity of integrating AI and data strategies, emphasizing the importance of high-quality data, scalable architectures, and robust governance. He outlines three essential steps: recognizing that AI requires the right data to succeed, prioritizing data quality and architecture, and establishing strong governance practices. He provides specific case examples highlighting the importance of a solid foundation and strategy.
David Crolene, VP, Data Analytics & AI, EXL Service
Wednesday, May 14: 5:00 p.m. - 6:00 p.m.
AI and related technologies, such as machine learning, neural networks, and text analytics, have created new and powerful opportunities for businesses. Innovative uses of language models integrated with generative AI hold enormous promise for positive change within enterprises. At the same time, ethical considerations and the widely known tendency of generative AI to fabricate information must be top of mind. The AI & Machine Learning Summit is a 2-day immersion into the possibilities inherent in an AI-driven future, offering the opportunity to harness AI & ML’s transformative potential.
Designed for chief information officers, chief data officers, data scientists, data engineers, enterprise architects, data analytics directors and managers, application developers, and tech-savvy business leaders.
Wednesday, May 14: 10:00 a.m. - 10:45 a.m.
The excitement around AI technologies tends to raise the question of how scalable AI products are or could be.
As the tech ecosystem embraces AI, leaders need proven strategies to scale effectively. Drawing from his experience leading AI at Instagram, Spotify, and SiriusXM, Isler shares battle-tested frameworks that serve billions of users globally. From Instagram Reels recommendations to Spotify's personalization features, these cross-industry insights will help companies navigate their AI journey and compete in the global market.
Derya Isler, VP of AI, SiriusXM
Wednesday, May 14: 12:00 p.m. - 12:45 p.m.
As AI technologies rapidly advance, it's crucial to prioritize responsible AI development.
Imagine a world where AI systems perpetuate biases, exacerbate inequalities, and undermine trust. Well, you can stop "imagining," because it will happen if we are not careful while developing such systems. AI technology is full of biases, unfairness, and socio-technical problems that can have unexpected results if not properly understood. Join Gupta on a journey to explore the nuances of fairness constraints that can make or break an AI system's integrity. She invites discussion on real-world scenarios of unfair AI and in-depth learning of what fairness constraints are and how to apply them using open source Python libraries.
Parul Gupta, Senior Production Engineer, Meta
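As one hedged example of what a fairness constraint can look like, the sketch below checks demographic parity: the rate of positive predictions should be similar across groups. It uses only the Python standard library rather than any particular open source fairness library, and the predictions, group labels, and 0.1 tolerance are made-up illustrative values, not material from the session.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Share of positive (1) predictions per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred == 1:
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


def satisfies_parity(predictions, groups, tolerance=0.1):
    """Constraint check: the gap must not exceed the chosen tolerance."""
    return demographic_parity_difference(predictions, groups) <= tolerance


# Made-up model outputs for two hypothetical groups, "a" and "b".
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(preds, groups))                # {'a': 0.75, 'b': 0.25}
print(demographic_parity_difference(preds, groups))  # 0.5
print(satisfies_parity(preds, groups))               # False
```

Demographic parity is only one of several fairness criteria, and which one applies depends on the use case.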
Wednesday, May 14: 12:45 p.m. - 2:00 p.m.
Wednesday, May 14: 2:00 p.m. - 2:45 p.m.
Learn about foundations for tracking multiple dimensions of the data ecosystem, such as AI model, metrics, and event lineages.
Modern data ecosystems face increasing complexity with the proliferation of AI models, metrics, and event streams. Shetty explores how Spotify tackled this challenge by implementing a comprehensive data lineage framework that serves as the foundation for tracking multiple dimensions of its data ecosystem. He shares the journey from a proprietary solution to an open source approach, demonstrating how data lineage can be extended beyond traditional use cases to encompass AI model lineage, metrics lineage, and event lineage.
Dhanush Shetty, Product Manager, Data Management, Spotify
Wednesday, May 14: 2:45 p.m. - 3:15 p.m.
Wednesday, May 14: 3:15 p.m. - 4:00 p.m.
The delivery of critical AI/ML initiatives enhances clinical efficiency and user experiences in healthcare and elsewhere.
This session examines challenges and strategies for managing AI/ML projects in the healthcare domain. Drawing from more than 19 years of experience in software engineering and project management, Mishra delves into the end-to-end lifecycle of AI/ML projects, from conception to deployment. He covers effective risk mitigation strategies, the translation of stakeholder needs into technical requirements, and cross-functional collaboration among data scientists, engineers, and product managers.
Sarbaree Mishra, Program Manager, Molina Healthcare Inc.
Wednesday, May 14: 4:15 p.m. - 5:00 p.m.
Assess the maturity of AI governance capabilities and explore the potential impact of future AI regulations.
AI governance remains one of the biggest hurdles to realizing the full potential of AI and ML, even in advanced organizations. While governance frameworks exist, they fail to connect high-level principles with the practical actions needed to manage risk effectively across the AI lifecycle. The result is stalled projects, increased risks, and missed opportunities for impact. Carlsson, host of the Data Science Leaders podcast, discusses how firms can bridge the governance gap, assess their governance maturity, and build scalable capabilities that ensure trust, compliance, and accelerated AI adoption.
Kjell Carlsson, Head of AI Strategy, Domino Data Lab
Wednesday, May 14: 5:00 p.m. - 6:00 p.m.
Thursday, May 15: 8:00 a.m. - 9:00 a.m.
Thursday, May 15: 9:00 a.m. - 10:00 a.m.
Transforming AI hype into business outcomes is the objective of getting your business AI-ready. Drawing on his AI Leadership Handbook: A Practical Guide to Turning Technology Hype Into Business Outcomes and on more than 60 interviews he conducted with AI leaders and experts, Welsch offers strategic insights into AI implementations with a nine-step approach. Gain practical knowledge on fostering innovation, driving human-AI collaboration, and leading AI initiatives. This AI leadership keynote talk covers strategy, leadership, culture, and security, equipping you with tools to boost AI literacy and achieve measurable business success.
Andreas Welsch, Founder & Chief AI Strategist, Intelligence Briefing
Thursday, May 15: 10:00 a.m. - 10:45 a.m.
Emerging Technologies and Trends in Data and Analytics takes you through the most exciting developments reshaping the industry and helping businesses close the data value gap, from the rise of data fabric solutions and edge analytics to the spread of XOps and real-time capabilities. Attend this track to dive into innovative new technologies and practices to meet growing challenges and opportunities.
Designed for chief information officers, chief data officers, digital transformation leaders, enterprise architects, data architects, data engineers, data scientists, and data management and analytics professionals.
Thursday, May 15: 10:45 a.m. - 11:30 a.m.
Project management skills play a vital role within data-driven organizations.
Data and analytics (D&A) projects often fail at alarming rates, with up to 80% delivering little or no value. The explosion of AI and its transformative potential is reshaping industries, making it imperative for organizations to master the basics of D&A to seize these opportunities. Sax explores a proven system for managing D&A projects that minimizes failure risks and drives meaningful outcomes. He introduces a practical project identification matrix to classify and manage projects effectively, mitigating the top reasons for failure.
Jonathan Sax, VP Analytics, US Bank
Thursday, May 15: 11:45 a.m. - 12:30 p.m.
Advanced techniques driving modern recommender systems include graph neural networks to uncover patterns in user-item relationships.
In today’s world, recommender systems shape what we watch, buy, read, and listen to, seamlessly tailoring experiences to match our unique preferences. This talk takes you behind the curtain of these algorithms, focusing on platforms like Netflix, YouTube, Amazon, and Spotify, to uncover how these systems work, from handling billions of datapoints to making personalized recommendations in real time—and how they can become smarter, fairer, and more impactful.
Disha Lamba, Data Scientist, CVS Health
Thursday, May 15: 12:30 p.m. - 1:45 p.m.
Thursday, May 15: 1:45 p.m. - 2:30 p.m.
A data integration hub (DIH) can simplify development of multiple front-end applications, providing auditability, simplicity, low latency, and low infrastructure cost.
Systems of record (SORs) are scattered across large enterprises, each individually fit for a specific purpose. If you want to use that data to digitally transform business, you need to access all your data to drive applications and analytics. A data integration hub (DIH) isn’t another database. It’s an architectural concept that fits in between SORs and front-end applications. Necessary data is provided at real-time speed, and long-term data is reconciled across sources and persisted dependably, regardless of source format. Come to this talk to see some real-world implementations of a DIH in the financial, telecom, transportation, and logistics industries. Learn the concepts, tips, tricks, and gotchas.
Paige Roberts, Head of Technical Evangelism, GridGain
Thursday, May 15: 2:45 p.m. - 3:30 p.m.
In today’s fast-paced business environment, efficient data management is key to driving strategic insights and informed decision making.
This session delineates best practices for simplifying data processes, improving data quality, and ensuring consistency across reporting platforms. Learn how to standardize data governance frameworks, automate manual tasks, and create a centralized data catalog for better accessibility and control. Discover strategies for effective collaboration between data stewards and business teams to maximize the value of data assets. Whether you're working with SQL databases, Power BI, or other reporting tools, this presentation provides actionable steps to optimize your data management practices, reduce redundancy, and improve data reliability.
Yashasvi Singh, Data Steward, Navy Federal Credit Union
Navigating the Hybrid and Multi-Cloud Future explores the growing array of cloud types and services being adopted by enterprises, accompanying opportunities and challenges, and how enterprises are rethinking traditional data management technologies and practices to truly unlock the value of cloud data and analytics in the real world. Attend this track to dive into key solutions and strategies for overcoming hot-button issues, from migration mistakes to licensing and FinOps to performance, security, governance, and integration tips.
Designed for chief information officers, chief data officers, digital transformation leaders, IT managers and directors, enterprise architects, data architects, cloud architects and engineers, and data management professionals.
Thursday, May 15: 10:45 a.m. - 11:30 a.m.
Find actionable strategies for harnessing Guidewire to build scalable, efficient, and user-friendly applications.
This session explores the intricacies of designing and developing robust insurance applications on the Guidewire platform, showcasing best practices for PolicyCenter and ClaimCenter configuration, integration strategies, and optimizing business rules for different lines of insurance. The session also covers advanced concepts such as creating custom UI components, managing complex workflows, and ensuring data security and compliance in enterprise-grade insurance solutions.
Ravi Teja Madhala, Senior Software Developer Analyst, Mercury Insurance Services, LLC
Thursday, May 15: 11:45 a.m. - 12:30 p.m.
The rise of the proprietary cloud data warehouse helped modernize data warehousing by providing scalability, convenience, and, most importantly, flexibility and openness.
Once data became available in the cloud, it was possible to use it for more use cases, including user-facing analytics, dashboarding, observability, machine learning, and so on. This led to recurrent performance challenges, a degraded user experience, significant runaway costs, and also vendor lock-in. Steinkamp discusses the role open source technologies (open source real-time analytical databases such as Druid, Pinot, and ClickHouse) and open data lake standards (Iceberg, Hudi, Delta Lake) play in transforming the modern data stack and helping organizations move away from a monolithic cloud data warehouse.
Zoe Steinkamp, Senior Developer Advocate, ClickHouse
Thursday, May 15: 12:30 p.m. - 1:45 p.m.
Thursday, May 15: 1:45 p.m. - 2:30 p.m.
Data security has become a top priority as organizations increasingly migrate their operations to the cloud.
With the exponential growth of data stored and processed in cloud environments, the stakes for securing sensitive information are higher than ever. GenAI is emerging as a transformative force in cloud data security, offering innovative solutions to combat threats such as malware, ransomware, and phishing. However, this revolutionary technology comes with its own set of challenges. GenAI is double-edged in cloud data security: it provides unprecedented capabilities to detect and mitigate security threats through advanced pattern recognition and automated threat response, but it also raises concerns about data privacy, ethical usage, and potential misuse.
Hardik Ruparel, Software Engineer-3, Nutanix and Founder, Stealth-Mode Cloud Project
Thursday, May 15: 2:45 p.m. - 3:30 p.m.
Data in the cloud has become commonplace, but at what cost?
Join this panel discussion as we consider the advantages and drawbacks of placing data in the cloud. Is it, in fact, the most cost-effective solution? What about privacy and confidentiality? What migration issues exist?
As the builders and keepers of the data systems and pipelines that fuel insights, data engineers are expected to wear many hats, and their role continues to grow in importance. With many organizations focused on accelerating their AI and analytics capabilities, there is an enormous demand for secure, trusted, easily accessible data. For data engineers, this means new user requirements, new workloads, and more challenges. Data infrastructure complexity, data silos, governance, and security all top the list. Still, the world of data engineering is evolving fast—from cloud data platforms and tools for ingesting, processing, integrating, and analyzing data to data catalogs, active metadata, and data observability. Attend this boot camp to dive into the latest technologies and strategies for success.
Designed for data engineers and anyone interested in data engineering.
Thursday, May 15: 10:45 a.m. - 11:30 a.m.
A critical intersection exists between design and development in software engineering.
Drawing from his extensive background in Java development, full-stack design, and system architecture, Manda discusses best practices for aligning technical requirements with user-centric design principles and shares insights from his work on cutting-edge technologies, including his transition from legacy systems to cloud-native architectures and his application of Agile methodologies to streamline workflows.
Jeevan Kumar Manda, Project Manager, Metanoia Solutions, Inc.
Thursday, May 15: 11:45 a.m. - 12:30 p.m.
Learn how companies can scale their data strategies, fuel advanced workloads, and centralize sensitive information without compromising trust.
Data is the lifeblood of modern enterprises, but with every petabyte collected, the stakes grow higher. Whether it’s customer, patient, or financial data, organizations are under mounting pressure to protect sensitive datasets from exposure while navigating an increasingly complex regulatory landscape. Yet many businesses still rely on outdated approaches that not only stifle innovation but also increase vulnerabilities. Kundavaram unpacks lessons learned from Fivetran’s experience working with global enterprises, sharing actionable insights on bridging cloud and on-prem environments, ensuring airtight data governance, and unleashing the full power of your data—without losing control.
Anjan Kundavaram, Chief Product Officer, Fivetran
Thursday, May 15: 12:30 p.m. - 1:45 p.m.
Thursday, May 15: 1:45 p.m. - 2:30 p.m.
AI can streamline your approach to data.
AI is not just transforming data pipelines for applications—it’s also streamlining the process of building these pipelines. AI-assisted tools can automate much of the tedious work traditionally done by data engineers. Join this session to learn about the opportunities to accelerate your data team efficiency and reliability.
Itamar Ben Hamo, CEO, Rivery
Thursday, May 15: 2:45 p.m. - 3:30 p.m.
Serverless data engineering refers to designing and managing data workflows that run on cloud computing resources provisioned on demand, typically in response to specific events.
In a serverless paradigm, developers focus on creating and running data pipelines without managing the underlying server infrastructure. Instead, the cloud provider dynamically allocates resources and handles scaling, availability, and maintenance. Serverless data engineering enables agile, scalable, and cost-effective solutions for modern data workflows. By offloading infrastructure management to cloud providers, organizations can innovate faster and focus more on delivering insights and value from their data.
Jerry Locke, Snowflake Practice Leader, Perficient
AI and related technologies, such as machine learning, neural networks, and text analytics, have created new and powerful opportunities for businesses. Innovative uses of language models integrated with generative AI hold enormous promise for positive change within enterprises. At the same time, ethical considerations and the widely known tendency of generative AI to fabricate information must be top of mind. The AI & Machine Learning Summit is a 2-day immersion into the possibilities inherent in an AI-driven future, offering the opportunity to harness AI & ML’s transformative potential.
Designed for chief information officers, chief data officers, data scientists, data engineers, enterprise architects, data analytics directors and managers, application developers, and tech-savvy business leaders.
Thursday, May 15: 10:45 a.m. - 11:30 a.m.
A semantic layer provides GenAI with a programmatic framework to make organizational context, content, and domain knowledge machine readable.
Enterprise AI’s business potential cannot be overstated: By employing standards-based semantic components such as metadata, business glossaries, taxonomy/ontology, and graph solutions, a semantic layer arms organizations with a framework to aggregate and connect siloed data and unstructured content, explicitly provide business context for data, and serve as the layer for explainable GenAI solutions. Tesfaye and Majumder present case studies explaining semantic layer technical architectures and exploring the components that enable enterprise-scale data transformation efforts.
Lulit Tesfaye, Partner & VP, Enterprise Knowledge, LLC
Urmi Majumder, Principal Data Architecture Consultant, Enterprise Knowledge, LLC
Thursday, May 15: 11:45 a.m. - 12:30 p.m.
Revolutionize workflows via AI agents.
Think of AI Copilot as a top-secret agent with a special mission to automate tasks, streamline processes, and revolutionize your workflow. Discover how to equip your agent with a variety of gadgets (actions) to infiltrate various business processes and uncover opportunities for automation. Learn how to harness the power of AI to maximize efficiency, reduce costs, and achieve remarkable results.
Nathan Bensch, VP, Microsoft Enterprise Services, enVista
Thursday, May 15: 12:30 p.m. - 1:45 p.m.
Thursday, May 15: 1:45 p.m. - 2:30 p.m.
A quick look at building a data project.
Asnani covers every step of the process of building a customer churn prediction pipeline, from data preprocessing and feature engineering to tracking experiments, building ML pipelines, and training high-performing classification models. The entire workflow is managed within MLflow, allowing developers to build, track, and deploy pipelines seamlessly. A Streamlit interface displays the churn predictions as a real-time visualization. This session offers a practical and approachable way to implement customer churn prediction for both beginners and experienced data practitioners.
Priyanka Asnani, Senior ML Engineer, Fidelity Investments
Thursday, May 15: 2:45 p.m. - 3:30 p.m.
AI, robotic process automation (RPA), and machine learning (ML) can transform government operations.
With efficiency, cost, and service enhancements being demanded of the federal government, the adoption of AI, robotic process automation (RPA), and machine learning (ML) is emerging as a major shift. These technologies can foster innovation and alter the processes and roles of various government agencies. AI-driven systems offer new data analysis possibilities that allow agencies to speed up decision making. These technologies are enhancing processes such as claims processing, records management, and a range of other citizen-facing activities, thereby improving the speed and quality of service delivery to citizens.
Hariharan Pappil Kothandapani, AVP, Lead Data Science & Analytics Developer, Fidelity Investments and Federal Home Loan Bank of Chicago
Thursday, May 15: 3:45 p.m. - 4:15 p.m.
Discover the future of conference engagement with an innovative idea that uses AI to record, transcribe, and build an interactive model around presentation content. Experience a live demo of the AI-powered chatbot used at Data Summit, designed to foster dynamic conversations by asking follow-up questions and providing insightful answers. You can interact with the bot to explore topics, dive deeper into sessions, and learn in a whole new way. This groundbreaking approach extends the value of conversations, making knowledge accessible and engaging.
Brian Pichman, Director, Strategic Innovation, Evolve Project
Thursday, May 15: 4:15 p.m. - 5:00 p.m.
Moving beyond speculation to data, this keynote presents analysis and insights from our comprehensive Q1 2025 market study spanning 200-plus organizations. We examine how companies are actually implementing modern enterprise data architectures to support analytics and AI initiatives, revealing current adoption rates, investment patterns, and expected outcomes. Building on our 2023 study's foundation, which tracked early investments in modern data architectures, we survey the evolution of data platforms by adding vector databases, knowledge graphs, and semantic layers. The session cuts through market hype to present evidence-based results and insights on which architectural patterns—from data fabric to data lakehouse—deliver measurable value and how organizations successfully balance AI innovation with enterprise data management and governance requirements.
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors