Data Summit 2023 is a unique conference that brings together IT practitioners and business stakeholders from all types of organizations. Featuring workshops, panel discussions, and provocative talks, attendees get a comprehensive educational experience designed to guide them through all of today’s key issues in data management and analysis. Whether your interests lie in the technical possibilities and challenges of new and emerging technologies or using Big Data for business intelligence, analytics, and other business strategies, we have something for you!
Access to all tracks including the two-day AI & Machine Learning Summit and one-day Data Mesh & Data Fabric Boot Camp is included when you register for an All-Access Pass or Full Two-Day Conference Pass. Attendees may switch between tracks as they choose. Only interested in AI & Machine Learning Summit or Data Mesh & Data Fabric Boot Camp? Stand-alone registration for these events is also available.
Tuesday, May 9: 9:00 a.m. - 12:00 p.m.
Located in Martha’s Vineyard A, Main Lobby Level
Every organization faces unique challenges in becoming data- and analytics-driven with its people, business processes, innovation, and competitive market. So why would an off-the-shelf, one-size-fits-all modern data architecture trend be the answer? It's not. This single plan addresses those challenges and helps you create the modern data architecture your organization needs, align it with its current business strategy, and incrementally unlock scalable business value. We follow a four-step process that translates business goals, leverages an analytics capabilities framework, and applies modern data architecture principles for a logical data architecture that allows for prioritization, planning, and developing people's proficiency. Next, we determine which architecture trends and modern data stacks can deliver data management and analytics capabilities most effectively in your organization. Data leaders and architects with this knowledge can demystify and communicate how emerging technologies and architecture trends from cloud-native to data mesh and data fabrics matter in your organization.
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors
Tuesday, May 9: 9:00 a.m. - 12:00 p.m.
Located in Martha’s Vineyard B, Main Lobby Level
Knowledge graphs are a valuable tool that organizations can use to manage the vast amounts of data they collect, store, and analyze. Enterprise knowledge graphs’ representation of an organization’s content and data creates a model that integrates structured and unstructured data. Knowledge graphs have semantic and intelligent qualities to make them “smart.” Attend this workshop to learn what a knowledge graph is, how it is implemented, and how it can be used to increase the value of your data. This is a very interactive workshop, so be prepared to learn about knowledge graphs and actually build one.
Joseph Hilger, COO, Enterprise Knowledge, LLC
Sara Nash, Principal Consultant, Enterprise Knowledge, LLC
Tuesday, May 9: 1:00 p.m. - 4:00 p.m.
Located in Martha’s Vineyard B, Main Lobby Level
An introduction, overview, and update for professionals who interact with data privacy functions and want to understand privacy and data security better. Explore the regulatory environment and the ways in which businesses are responding (and failing to respond). Consider why privacy is important and what's changed to make it even more relevant today. How have evolving tech and consumer preferences affected information practices and data security? Learn simple models to harden yourself and your coworkers against privacy and security threats. Join privacy professional Jockisch in this interactive workshop to understand how to better protect the sensitive data in your business.
Jeff Jockisch, CEO, CIPP/US, PrivacyPlan and Your Bytes Your Rights, Avantis Privacy, Data Collaboration Alliance
Tuesday, May 9: 1:00 p.m. - 4:00 p.m.
Located in Martha’s Vineyard A, Main Lobby Level
In an information economy, data is the currency of business. Digital upstarts are disrupting every industry, making it imperative that organizations have a strong data strategy that turns data into insights and profitable activity. Organizations need data to streamline operations and reduce costs, to improve decisions and plans, and to grow revenues and profits. A data strategy is an enterprise-wide plan to harness data and analytics to achieve business goals. At a high level, it is a blueprint for creating a data-driven organization; at a low level, it is a set of blueprints for designing a data architecture to acquire, transform, and deliver data to business users and applications. Eckerson explains the keys to developing an actionable data strategy, how to build an executable roadmap, and how to create a data strategy that aligns with business needs, based on an organization’s unique circumstances, culture, and data maturity.
Wayne Eckerson, President, Eckerson Group
Wednesday, May 10: 8:45 a.m. - 9:30 a.m.
AI and the internet are transforming our understanding of how the future happens, enabling us to acknowledge the chaotic unknowability of our everyday world.
Back when we humans were the only ones writing programs, data looked like the oil fueling those programs. But now that machines are programming themselves, data looks more like the engine than its fuel. This is changing how we think about the world from which data arises, and that data is now shaping as never before. We’ve accepted that the intelligence of machine intelligence resides in its data, not just its algorithms—particularly in the countless, complex, contingent, and multidimensional interrelationships of data. But where does the intelligence of data come from? It comes from the world that the data reflects. That's why machine learning models can be so complex that we can't always understand them. The world is the ultimate black box. Weinberger looks at the implications of this for people who work with data.
David Weinberger, Harvard metaLAB and Harvard Berkman Klein Center
Wednesday, May 10: 9:30 a.m. - 9:45 a.m.
iRobot, the leading global consumer robot company, designs and builds thoughtful robots and intelligent home innovations that make life better for customers across the globe. With over 50 million units sold worldwide, iRobot relies on accurate, highly reliable data to drive decision making and power operations across the business to fuel their explosive growth. Leber discusses why and how the team invested in data observability with Monte Carlo to improve data quality across the business and help its company maximize the potential of their data.
Christina Leber, Principal Data Software Engineer, iRobot
Wednesday, May 10: 9:45 a.m. - 10:00 a.m.
Are you drowning in data but lacking in insight? Eighty percent of business leaders say data is critical in decision-making, yet 41% cite a lack of understanding of data because it is too complex or not accessible enough. Companies are using graph databases to leverage the relationships in their connected data to reveal new ways of solving their most pressing business problems and creating new business value for their enterprises. Mohr shows real-world use cases that include real-time recommendations, fraud detection, network and IT operations, AI/ML, supply chain management, and more.
Dave Mohr, Regional VP, Neo4j
Wednesday, May 10: 10:45 a.m. - 11:45 a.m.
Collecting, querying, and manipulating data are important, but analyzing it well provides a competitive advantage.
Organizations employ many data analysts embedded in various departments and business units. These data analysts cost organizations millions of dollars in wages annually. Surprisingly, corporate data teams don’t know most of the data analysts in their organization, nor do they have a strategy to align them or optimize their organization’s investment in them. Eckerson presents a comprehensive strategy for empowering data analysts; describes how to make a business case for developing a self-service strategy that optimizes their time and output; and explains how to motivate and retain data analysts (people), how to organize and manage data analysts (organizations), how to govern data analyst output (process), and how to select tools and products that enable them to work as efficiently and effectively as possible (technology).
Wayne Eckerson, President, Eckerson Group
Wednesday, May 10: 12:00 p.m. - 12:45 p.m.
We’re all looking for insights that can be gleaned from our data. In addition to being data-driven, consider being insights-driven.
Collecting more data doesn't always translate into better business outcomes. The best companies are shifting their entire culture to obsess about generating more insights that can be turned into tangible results in profits, revenue, and growth. Ugarte shares strategies for how teams can start to make the shift to being insights-driven and how to turn those insights into profitable decisions. Learn how to determine the ROI of your data and why more insights are the key to increasing the value of data. Look at your decisions as a process instead of as one-offs and start optimizing this internal process. Connect the dots in how data becomes an insight and then a winning decision.
Ruben Ugarte, Decision Strategist, Practico Analytics
The MEMIC Group was no different from others in that it was dealing with a fragmented data ecosystem, from data in the cloud, in the enterprise data warehouse (EDW), and in multiple application databases and files. As a result, it was difficult to access, analyze, and manage data. Holbrook discusses how MEMIC modernized its data architecture to empower business users to connect to a single location to gain real-time access to quality data and provided the security team with a centralized point to monitor and manage all data access.
Matthew J. Holbrook, Director, Enterprise Data, Analytics and Architecture, The MEMIC Group
Wednesday, May 10: 2:00 p.m. - 2:45 p.m.
A key point about succeeding with digital transformation involves obtaining commercial data services and utilizing their capabilities to the fullest.
Acquiring commercial data services is not as simple as signing a standard contract without a preliminary and thorough investigation and an understanding of how to broker, evaluate, and integrate the data into a corporate or governmental data lake, data warehouse, or data mesh. Huffine calls your attention to the complexities of licensing, negotiations, usage, and ROI. Additionally, he covers environmental, financial, and core data (ZIP codes, GIS layers, etc.) as well as use cases for modeling, dashboards, research, and regulatory activities.
Richard Huffine, Assistant Director, Enterprise Information & Records, Federal Deposit Insurance Corporation
Wednesday, May 10: 3:15 p.m. - 4:00 p.m.
Some data is meant to be shared, but other data requires being secured so it doesn’t get into the wrong hands.
One of the biggest concerns of small, middle-sized, and enterprise companies involves securing data and ensuring integrity when collaborating or integrating with other services. Seasoned executives know that issues of IT security, compliance, regulations, cyber liability insurance, supply chain requirements, incident response, and forensics should all be addressed in the contracts. However, it is easier to identify contractual protections than to obtain them. Agreeing to contractual commitments is a function of both liability to the enterprise and negotiating leverage.
David Adler, President/Founder, Adler Law Group
With Datasparc's flagship product DBHawk, users receive zero trust database access to the data they need. Find out how DBHawk provides secure password-less access to on-premise and cloud databases.
Manish Shah, CEO, Datasparc Inc.
Wednesday, May 10: 4:15 p.m. - 5:00 p.m.
Challenges in licensing compliance have not diminished; in fact, they are more challenging than ever.
Unisphere’s report, “Managing the Software Audit: 2022 Survey on Enterprise Software Licensing and Audits Trends,” surmised that, due to lost revenue as a direct result of COVID-19, major software vendors increased the pace of their software licensing audits to generate additional revenue. Your risk of a software audit is greater now than at any point in time. Corey and Sullivan discuss the significant findings of the survey and explore the stealth audit, the newest tool in the vendor audit toolbox. They explain the difference between vendor policy and contractual obligation and expose how software licensing trolls have weaponized software vendor audits. Don’t be the next victim!
Michael Corey, Co-Founder/Chief Operating Officer, LicenseFortress
Don Sullivan, Product Line Manager, Business Critical Applications, Broadcom (VMware)
Wednesday, May 10: 10:45 a.m. - 11:45 a.m.
Legacy data architecture needs to be modernized to meet today’s data and analytics needs.
Capital One’s move to the cloud required it to modernize its data operations for this new environment. This meant learning how to balance the flexibility and efficiency of managing data in the cloud in order to generate the most value from its data. Bharathan shares more on the decisions Capital One made throughout this journey around monitoring, access, schema management, resiliency, cost, security, load patterns, and governance. He shares lessons learned—what worked, and what didn’t—and more on the tools Capital One developed to ensure a well-managed, well-governed cloud data platform.
Ganesh Bharathan, Director, Data Engineering, Capital One Software
Over the past few years, the IT landscape has experienced significant disruptions. Many of these transformations are reshaping database administrator roles in organizations, from the introduction of new technologies to the increasing size and complexity of the database environment. Additionally, emerging data types and modern applications drive enterprises to adopt new data platforms. DBAs are under pressure to continually evolve to support ongoing innovation. The roles and duties traditionally performed by DBAs have changed as cloud adoption and automation become commonplace.
Steve Doughty, Director, Solution Development, Datavail
Check the website for updates.
Tal Doron, AVP Solution Architects, GigaSpaces
Wednesday, May 10: 12:00 p.m. - 12:45 p.m.
Modern data architectures optimize for quick delivery at scale.
Zielnicki explores how Stitch Fix evolved its large suite of recommender models into a novel model architecture that unifies data from client interactions to deliver a holistic and real-time understanding of their style preferences. Stitch Fix’s Client Time Series Model (CTSM) is a scalable and flexible sequence-based recommender system that models client preferences over time, based on event data from various sources, to provide multi-domain, channel-specific recommendation outputs. The model has enabled Stitch Fix to continuously provide personalized fashion at scale, like no other apparel retailer.
Kevin Zielnicki, ML Architect, Data Science, Stitch Fix
Johnson shares how Holiday Inn Club is serving its customers with more reliable, up-to-date access to Salesforce data. Learn how Holiday Inn Club took just 2 weeks to build automated data integration pipelines that support organizational growth and better surface massive volumes of transactional data in Azure and SQL Server warehouses for holistic reporting.
Jerod Johnson, Senior Technology Evangelist, CData Software
Wednesday, May 10: 2:00 p.m. - 2:45 p.m.
DataOps affects everyone involved in the data ecosystem, which pretty much encompasses all employees, so adopting a strategy for agile data delivery is important.
Industry publications and thought leaders have been touting the benefits of composable design for both business and architecture. For roughly 10 years, Composable Analytics has been ahead of that curve. We were founded on using composable design strategies to get actual projects up and running and providing value for clients. Vidan shares some of the real-world lessons learned over that time and explores some of the more common usage patterns that Composable Analytics has found that help put theory to practice and composable architecture into production.
Andy Vidan, CEO & Co-Founder, Composable Analytics, Inc.
Learn more about ransomware and the different options to recover your data. During this presentation we will discuss the following: whether to copy, replicate, mirror, image, and/or back up your data; the different data recovery options; the advantages and disadvantages of each option; and when to use and when to avoid them.
Geoff Rennie, Master Pre-Sales Engineer, Data Protector, OpenText
Wednesday, May 10: 3:15 p.m. - 4:00 p.m.
Tools to collect, organize, store, and analyze data are part of a modern data stack that can transform data.
As the data landscape continues to evolve, data usage has transformed from humble beginnings in recordkeeping to a strategic asset that powers key business decisions, customer experiences, developer toolchains, AI platforms, and more. This transformation has influenced the design and implementation of modern data stacks. Li explores technologies that underpin modern data infrastructure, including Presto and Apache Pinot, explaining their ability to process federated, real-time data at scale. Using real-world use cases, he highlights practical considerations for crafting a stack that can handle the diverse and complex data needs of modern businesses.
Andy Li, Senior Software Engineer
Wednesday, May 10: 4:15 p.m. - 5:00 p.m.
Putting data first rather than taking an application-centric approach is the mark of a modern data architecture.
Emerging modern business-oriented data architectures such as data hub, data lake house, data fabric, and knowledge graphs put data first. By treating data as a supply chain, from which applications hang, as opposed to traditional application-centric approaches, where data is suborned into silos, Bentley explains how to solve complex data problems simply. Data unity, data security, data governance, data context, and data quality are all ensured throughout the data lifecycle without the complexity of multiple integrations from multiple vendors and components. Through real-world case studies, Bentley highlights the advantages of this approach, lessons learned, and practical advice in implementing these modern architectures that augment your existing data ecosystem without the need for “rip and replace.”
Jeremy Bentley, Head, Strategy, MarkLogic
While data quality tools find issues in production, it is too little, too late. Production data issues are expensive to fix and cause business discontinuity and reputation risk. iCEDQ believes companies should adopt a left-shift approach to prevent faulty processes and data from entering production.
Sandesh Gawande, CTO, iCEDQ
Wednesday, May 10: 10:45 a.m. - 11:45 a.m.
Data silos continue to impede access to needed information. Solutions are on the horizon.
With a 133-year history, Northern Trust has a backbone of IT infrastructure built decades ago, when on-premises solutions dominated the technology landscape. Due to the complexity of global regional regulatory requirements and the limitations of legacy systems, valuable data assets are maintained and isolated only in online transactional processing (OLTP) databases. The company faced challenges in data sharing, management, and governance in supporting enterprise-level analytics projects to meet business needs and growth. A digital modernization initiative took place that had a data mesh ecosystem as a critical component, leveraging cloud services on Azure and other modern technologies.
Ming Yuan, SVP -- Data Mesh, Shared Services, IT, Northern Trust Corp.
Pratima Tripurneni, VP, Head, Enterprise Data Delivery, Northern Trust Corp.
Wednesday, May 10: 12:00 p.m. - 12:45 p.m.
Enable your data team to get the most value from their time and quickly deliver needed business insights through a next-generation data fabric methodology.
MacWilliams introduces a technology-agnostic methodology that solves the common challenges facing data teams and focuses on the processes among technologies—onboard data faster, flex automatically when data changes, create solutions that are manageable across technologies, and provide the foundation to be able to monitor and maintain your data fabric for the future. The methodology integrates with already existing technologies. Starting with the end in mind, MacWilliams covers how to best monitor and maintain your platform, how to maximize data team capacity, how to utilize metadata to streamline your team’s development, and where to build custom and leverage modern technologies.
Doug MacWilliams, Director, West Monroe
Wednesday, May 10: 2:00 p.m. - 2:45 p.m.
Moving to the cloud is now a normal function but presents interesting new challenges.
As companies look to scale, they face new and unique challenges related to data management in the cloud. Data mesh offers a framework and a set of principles that companies can adopt to help them scale a well-managed cloud data ecosystem. Learn how Capital One approached scaling its data ecosystem by federating data governance responsibility to data product owners within their lines of business and hear how companies can operate more efficiently by combining centralized tooling and policy with federated data management responsibility.
Patrick Barch, Senior Director, Product Management, Capital One Software
Making sense of all your input data isn’t fun, especially when consuming inputs from 10s to 1,000s of data sources daily. If your data teams are orchestrating massive amounts of data across multiple data pipelines, it’s nearly impossible to feel confident in the data quality within your data warehouse. Instead of retroactive data monitoring, it’s time for a more proactive approach to ensure better data quality for your warehouse.
Ryan Yackel, CMO, IBM Databand
Wednesday, May 10: 3:15 p.m. - 4:00 p.m.
New technologies can transform companies’ data journeys.
Data fabrics and data meshes are promising paradigms for helping organizations on their data journeys. Data fabric is a new approach complementing the existing infrastructure and data management technology, accessing the data on demand as it’s needed by the consumers of the data, with centralized metadata and governance. Data mesh accesses the data on demand, providing the metadata and governance capabilities at the edges of the organization, where the data resides, enabling agility and autonomy throughout the organization. While much of the conversation around data fabrics and data mesh has been primarily about which approach or architecture is “better,” Fried discusses how the real value of these concepts isn’t rooted in an “either/or” approach and why they must be viewed as complementary.
Jeff Fried, Director, Platform Strategy & Innovation, InterSystems
Wednesday, May 10: 4:15 p.m. - 5:00 p.m.
Data architectures tend not to be static and new approaches are always welcome.
The concept of data mesh has resonated strongly with both data professionals and the broader engineering community. Loose coupling, enablement of federated development, and data sharing ease the difficulty of data management in both large and small organizations, as well as bringing data systems closer to parity with modern microservice-based systems. Cordo explores how the adoption of an event-based data architecture can enable an organization's sustainable transition to data mesh. This includes an overview of event-based architecture, architectural patterns for event-based data systems, and organizational considerations.
Elliott Cordo, CEO/Founder/Builder, Data Futures, LLC
Wednesday, May 10: 10:45 a.m. - 11:45 a.m.
There’s no stopping the introduction of AI-based technologies into the enterprise.
Data science methods provide a means to establish analytic tradecraft, capable of managing a large amount of data, allowing for full characterization of actor behaviors, and providing valuable insights. As data volume increases, AI/ML plays a significant role in this "high data entropy" space, providing users with the means to combine multi-sourced datasets, with the goal of learning and identifying patterns and developing actionable insights while assuring they follow the organization’s law and policy boundaries. Rodriquez presents a case study on how the intelligence community (IC) is addressing these challenges by establishing innovative AI/ML governance and data management methodologies, supporting the development of a policy-compliant AI governance ecosystem, predicating strategies to enforce legal and policy considerations, and establishing data controls.
Efrain Rodriquez, Director, Business Intelligence and Metrics, U.S. Department of Defense (DoD)
Wednesday, May 10: 12:00 p.m. - 12:45 p.m.
MLOps can streamline ML development, thus increasing operational effectiveness.
Jablonski looks at the journey to defining and implementing an MLOps solution for your organization. Jablonski begins with the metrics necessary for successful model lifecycle measurement, then discusses the technology stack to be deployed and the operational model necessary for success at scale. These three must all be defined and built collectively to ensure alignment between operational needs, technology capabilities, and success metrics. High model performance, elimination of bias, and predictability are all key elements of an MLOps strategy.
Joey Jablonski, VP, Analytics, Pythian
Wednesday, May 10: 2:00 p.m. - 2:45 p.m.
Neural networks can be used for many applications in the world of Artificial Intelligence.
ChatGPT, Large Language Models (LLMs), and generative AI have captured the attention of people worldwide. As these tools, many based on neural networks, move from experimental lab projects to widespread usage by the general public, questions arise about their business applications. Can these tools be fine-tuned to enhance competitive intelligence and find insights into customer behavior? How do they aid in answering product marketing questions, monitoring competitor strategies, and influencing decision making? What competitive edge can these AI-based tools provide to you?
David Seuss, CEO, Northern Light
Wednesday, May 10: 3:15 p.m. - 4:00 p.m.
Knowledge graph technology expands by employing neural networks.
Probably the most important reason for building knowledge graphs has been to answer this age-old question: “What is going to happen next?” Given the data, relationships, and timelines we know about a customer, patient, product, etc. (“the entity of interest”), how can we confidently predict the most likely next event? Graph neural networks (GNNs) have emerged as a mature AI approach for knowledge graph enrichment. GNNs enhance neural network methods by processing graph data through rounds of message passing. Aasman describes how to use graph embeddings and regular recurrent neural networks to predict events via GNNs and demonstrates creating a GNN in the context of a knowledge graph for building event predictions.
Jans Aasman, CEO, Franz Inc.
Wednesday, May 10: 4:15 p.m. - 5:00 p.m.
Alec Gilarde, Senior Associate, Bullpen Capital
Thursday, May 11: 9:00 a.m. - 9:45 a.m.
Technology is evolving, and many are understandably fixated on “the next big thing.” As enterprises increasingly recognize that business outcomes are limited less by technological capability and more by the comfort and confidence in adoption and impact, futurist Mike Bechtel suggests that, to arrive at our preferred tomorrows ahead of schedule, organizations need to focus on the element of trust in emerging technologies.
Mike Bechtel, Chief Futurist, Deloitte Consulting LLC
Thursday, May 11: 9:45 a.m. - 10:00 a.m.
Geospatial datasets have long been difficult to join and make available in a single place. Businesses needed specialized resources, skill sets, and tools in order to prepare, clean, and transform the data before extracting value. Perhaps for this reason, only 26% of data strategy leaders today, according to Forrester, report that their organizations are utilizing location intelligence to its full potential. How can businesses go from resource-intensive geospatial data processes to fast-and-easy data unification, pattern detection, and operationalized AI/ML? Patel unveils a new geospatial technology that will transform how businesses extract value from location intelligence.
Ankit Patel, SVP of Engineering, Foursquare
Thursday, May 11: 10:45 a.m. - 11:30 a.m.
As data science evolves, so does its power to affect many different functions.
Financial institutions have rich, customer-centric data and are in a strong position when building AI solutions. Since the functions differ, AI use cases also differ. The recent intense interest in generative AI has given rise to a new aspect of data science: Prompt Engineering, which is basically how humans train models like GPT by creating appropriate prompts. Using her broad data science experience across multiple industries, Kaur brings it alive in this presentation by sharing some best practices about data, generative AI, and the right resources. As a bonus, she shares her thoughts on where the industry is heading and what the focus looks like in the coming years.
Supreet Kaur, AVP, Morgan Stanley
Thursday, May 11: 11:45 a.m. - 12:30 p.m.
Data analysis is at its most powerful when it can be interpreted across many different types of data and points in time.
The fastest-growing types of data this decade are real-time geospatial and time series data fueled by the proliferation of location-enriched sensor data. Innovators are creating new design patterns to harness the unique characteristics of fast-moving spatio-temporal data and develop new data-driven products. This evolution to high-velocity geo-coded and time series data requires real-time analysis to understand spatial relationships over time. Learn about how the rise of real-time location intelligence and the power of extending IoT data to readings over time and space lead to new capabilities for analysis.
Thursday, May 11: 2:00 p.m. - 2:45 p.m.
Organizational data literacy is a crucial part of being data-driven.
Prioritizing data quality is step one to becoming data-driven. However, what is step one to improving data quality? In most IT suites, the argument is better processes, better data fields, and better technology. Weinrich disagrees. Organizational data literacy is the most important step to truly achieving great data quality, data-driven decision making, and data management. Those fluent in data as a second language can identify and understand data sources, analyze them to generate insights, and use those insights to make better decisions that increase organizational value. Weinrich focuses on encouraging attendees to stop keeping data as a guarded gem in a warehouse, and to encourage a data-literate culture that highlights data democratization.
Kasara Weinrich, Principal Consultant, Future of Work, ADP
Thursday, May 11: 3:00 p.m. - 3:45 p.m.
To function at a high level and make the most of their data, organizations need data science talent.
As the imperative to become more data-driven continues to permeate every aspect of business, the demand for data science talent (and more broadly, for data-literate employees across the spectrum) has exploded. The resulting talent pipeline challenge has led organizations to pursue both external hiring and internal upskilling. In many cases, organizations end up looking for data science “unicorns,” i.e., data science professionals qualified across the vast range of skills from computer science to machine learning, from cloud computing to data storytelling. Based on extensive research into the data science profession, Hamutcu presents a framework on key roles in data science and associated skills and knowledge.
Hamit Hamutcu, Co-Founder, Senior Advisor, Initiative for Data Analytics and Data Science Standards, Institute for Experiential AI at Northeastern University
Thursday, May 11: 10:45 a.m. - 11:30 a.m.
Intent on pursuing a data-driven approach? Proceed with caution as unanticipated pitfalls may lie ahead.
"Data-driven" has become one of the most popular buzzwords among tech startups. But what does being data-driven really mean? Unintended bias and relying on raw data impact society in many ways—from hiring practices to credit scoring and underwriting algorithms. Simple but necessary shifts in mindset can change the way startups and investors look at data when making major investment decisions. Lai speaks to adopting a data-informed approach, rather than being data-driven. She shows how to translate data so that it is “apples-to-apples,” revealing the hidden gems in your data lineup, and explains how to filter your data appropriately and what data outliers investors should be looking for to get a more honest and unbiased picture. Finally, she shares the risks incurred when decisions are made without data substantiation.
Ann Lai, Investor/Advisor
Thursday, May 11: 11:45 a.m. - 12:30 p.m.
Analytics cloud options balance the benefits of moving to the cloud with some challenges.
The ability for companies to have access to robust, real-time business intelligence (BI) and analytics functionality is more important now than ever before. Companies increasingly looking to migrate their BI and analytics capabilities to the cloud must now design, develop, and deliver data analytics in a different way. However, with so many cloud analytics options to choose from, how can businesses select the best option to match their unique needs and objectives—and what does the journey to get to the cloud look like?
Tom Hoblitzell, VP, Data Management, Datavail
Thursday, May 11: 2:00 p.m. - 2:45 p.m.
No one wants to indulge in risky behavior when considering a move to the cloud.
Advantages to moving to the cloud include fast scalability, convenience across locations, and elasticity to handle changing workloads. But the more organizations in a wide variety of industries make the move to the cloud, the more we see them tripping over hidden land mines, such as out-of-control costs. In addition, platform lock-in can steal your options for shifting in the future as requirements change yet again. New regulations might restrict data location, or the requirements of your analytics might make more sense on-prem in some places, or maybe you need to be on more than one cloud. What about having a high availability capability somewhere else?
Paige Roberts, Open Source Relations Manager, Vertica
Thursday, May 11: 3:00 p.m. - 3:45 p.m.
The cloud is not a monolithic entity, but exists in many forms.
Data platform teams are increasingly challenged with accessing multiple data stores that are separated from compute engines, such as Spark, Presto, TensorFlow, or PyTorch. Whether your data is distributed across multiple data centers or multiple clouds, a successful heterogeneous data platform requires efficient data access. Palmer shares an architecture to overcome data silos, based on open source tools that simplify the adoption of hybrid and multi-cloud environments for analytics and AI/ML workloads.
Greg Palmer, Lead Solutions Engineer, Alluxio, Inc.
Thursday, May 11: 10:45 a.m. - 11:30 a.m.
Database migration methodologies align with cloud and database trends.
Learn about the latest trends in database cloud modernizations, including the cloud-native databases offered by AWS and Azure. Agarwal has tips to help you pick the right database for your modern application and analytics needs, including use cases and comparing IaaS and PaaS databases.
Michael Agarwal, Director and Global Practice Leader, Site Reliability Engineering, Cloud & NoSQL Databases, Datavail
Modern SaaS applications are data-intensive, use AI, and need to drive fast, interactive analytics on operational data in real time for thousands of users. But how do you power an interactive application of the future? The next generation of unified databases gives organizations a new opportunity to dramatically simplify their data architecture and power applications that are 100 times faster while reducing costs by 50%–60%. SingleStore’s technology runs as a cloud service or self-hosted software and supports multiple data models in a single system, including structured relational data, JSON, full-text, time series, spatial, key/value, and vector data. Learn how customers are unifying their data to power some of the most innovative applications today!
Ian Gershuny, Principal Solutions Engineer, SingleStore
Thursday, May 11: 11:45 a.m. - 12:30 p.m.
DevOps is not yet without its challenges, but they are solvable.
Just as DevOps is designed to improve the software development and deployment lifecycle, DataOps is an approach to quickly deliver data and accelerate deployment of analytics solutions. Both the democratization of analytics as well as the implementation of specialized "built-for-purpose" tools are driving significant change in infrastructure and approach to data at organizations regardless of size. If you are encountering obstacles in your data analytics processes or practices, come learn about the key differences between DevOps and DataOps practices that can help lead your organization to better success in your data and analytics work.
Frank Cervone, Program Coordinator, Information Science and Data Analytics, San Jose State University
Thursday, May 11: 2:00 p.m. - 2:45 p.m.
Learn game-changing inference insights and concrete hacks to significantly increase performance across your AI pipeline.
From his background in serving and deploying models at scale, Millon shares an in-depth case study of an inference pipeline serving hundreds of models on multi-cloud clusters. He provides some best practices for optimizing MLOps and AI inference pipelines. He discusses accelerating model loading, asynchronous hot-swapping, multi-cloud and multi-region Kubernetes clusters, and optimizing spin-up times and autoscaling. Balancing in-house technical complexity and managed solutions is key. He reveals not only success stories in his journey to designing Krea’s back end, but also details some mistakes.
Erwann Millon, Founding Engineer, Krea.ai
Thursday, May 11: 3:00 p.m. - 3:45 p.m.
Open source databases are becoming more common, but how do they compare with proprietary databases?
Deployed, maintained, and scaled properly, open source databases can be one of the best decisions—both strategically and economically—that organizations make. But open source databases are not without challenges, and you have to know what to look for when evaluating open source alternatives. Will open source database resources and support match what you’ve come to expect from proprietary databases? Is there enough standardization? Can you switch between different databases and integrate them with existing software? Inamdar offers specific strategies for how and where to vet open source databases.
Anil Inamdar, VP & Head of Data Solutions, Instaclustr @ NetApp
Thursday, May 11: 10:45 a.m. - 11:30 a.m.
Integrating AI into business operations is accelerating, thanks to cloud computing.
The challenge in embedding AI into businesses involves how to do it in an agile, scalable, and cost-effective way. The organizations that will have a competitive advantage are those that take an end-to-end approach with hybrid cloud AI that includes both AI core infrastructure and AI edge infrastructure for easier management and rapid deployment. In this panel session, Voruganti and Kelleher speak to the latest technological advances and field learnings in scaling AI/ML workloads.
Rory Kelleher, Director, Global Business Development for Healthcare, NVIDIA
Kaladhar Voruganti, Senior Technologist, Office of the Chief Revenue Officer, Equinix
Thursday, May 11: 11:45 a.m. - 12:30 p.m.
The promise of AI extends beyond business processes, having real impacts in healthcare.
Thalassemia and sickle cell disease are rare blood disorders that can have devastating effects on the patient. Both diseases are inherited abnormalities that impact the ability of hemoglobin to carry oxygen throughout the body. Advances in neural networks, combined with large volumes of rare disease healthcare datasets, now can successfully identify undiagnosed thalassemia patients and predict a sickle cell crisis with a high degree of precision.
Danita Kiser, Vice President, Research Collaborations, Optum
Thursday, May 11: 2:00 p.m. - 2:45 p.m.
Digital innovation is a corporate necessity, but with the newest developments in AI-based technologies, how much do C-level executives and board members understand?
Before attempting to initiate digital improvements, your organization must improve its data and technology decision making. Aiken and Cesino provide executive insight on appropriate investments, key talent, and focused implementation. These permit organizations to rapidly realize concrete top- and bottom-line improvements directly attributable to maturing data practices as well as keeping regulators at bay. Balanced people and process investments allow your organization to best use valuable data and new technologies to achieve a sustained competitive advantage.
Peter Aiken, Associate Professor of Information Systems, Virginia Commonwealth University and Anything Awesome
Michael Cesino, CEO, Visible Systems Corporation
Thursday, May 11: 3:00 p.m. - 3:45 p.m.
This panel of experts focuses not only on what has transpired during the conference but what we can expect going forward.
Generative AI, more intelligent machine learning, quantum computing, algorithmic determinations of behavior, and other technological advances are changing how we work and what we work with. This panel considers the implications of technology advances for good and bad.
Marydee Ojala, Editor, Online Searcher, Computers in Libraries Magazine, & Editor-in-Chief, KMWorld Magazine
Thursday, May 11: 4:00 p.m. - 5:00 p.m.
With all the hype surrounding modern data architectures today, it’s harder than ever to discern what’s authentic from market noise. To guide people, Radiant Advisors and Database Trends and Applications designed and conducted a market survey in Q1 2023 that collected and analyzed data on what companies are doing. The study focused on the perceptions, planning, and adoption of cloud architecture, the data lakehouse, data mesh, data fabric, streaming data architectures, and robust multi-modal analytics across industries and company sizes. See how your takeaways from this conference compare with the insights from this market study.
John O'Brien, Principal Advisor & Industry Analyst, Radiant Advisors