8:45 AM
Keynotes
Length: 45 Minutes
Description: Stonebraker focuses on the current market for Big Data products, specifically those that deal with one or more of “the 3 V’s.” On the one hand, the Volume problem for business intelligence applications is pretty well solved by data warehouse vendors. However, upcoming data science tasks are poorly supported at present. On the other hand, there is rapid technological progress, so we need to stay tuned. In the Velocity arena, recent “new SQL” and stream processing products are doing a good job, albeit with some storm clouds on the horizon. The Variety space has a collection of mature products, along with considerable innovation from startups. He identifies opportunities, particularly those enabled by possible disruption from new technology. And then there’s that 800-pound gorilla in the corner.
10:45 AM
Moving to a Modern Data Architecture
Length: 1 Hour
Description: There is a conflict going on in the IT industry pitting Big Data against “legacy” data architectures. However, traditional architecture and technologies and newer Big Data approaches each offer advantages.
Title: Designing a Data Architecture for Modern Business Intelligence & Analytics
Time: 10:45 AM - 11:45 AM
Description: Find out what you need to know regarding structured, unstructured, and semi-structured data; hybrid integration and data engineering; and different analytical uses. Learn about the technologies to use for different projects, including relational, columnar, and in-memory. Topics include the must-have underlying foundational concepts for every project; high-level architectural design to pull it all together; the best and pragmatic practices to ensure success; how to avoid deadly data and integration silos; and how to prevent data swamps, data shadow systems, and spreadmarts.
Title: Enabling the Real-Time Enterprise—Data Lakes, Streaming, & the Cloud
Time: 10:45 AM - 11:45 AM
Description: To support a modern data architecture and approach to analytics, data integration strategies now support on-prem, cloud, and hybrid deployments. Meanwhile, streaming architectures featuring change data capture (CDC) technology are rapidly being embraced to process data in motion. This session discusses the new requirements and best practices to be successful in enabling a real-time enterprise, whether in a data lake, via streaming technology, or in the cloud.
Competing on Analytics
Length: 1 Hour
Description: AI and Big Data offer seemingly unlimited potential for organizations to better understand their customers, make more informed decisions, and address challenges with greater agility. It’s important to understand the choices available to achieve the best outcomes.
Title: Applied Analytics: From BI to AI
Time: 10:45 AM - 11:45 AM
Description: The intersection of AI and Big Data provides the ability to deliver more targeted, timely, relevant insight in a pervasive and intuitive manner. However, delivering that simplicity requires an analytics and data ecosystem that is markedly more complicated than it was 10 years ago. To that end, effectively deploying analytics from BI to AI is now an exercise in portfolio management, complete with discrete customer segments, diverse data environments, development methods, and a wide spectrum of deployment options. This session puts the diverse and growing landscape of analytics capabilities from BI to AI into context.
Title: How to Build Data Science Teams that Deliver Business Value
Time: 10:45 AM - 11:45 AM
Description: In spite of the buzz around AI, organizations are struggling to build data science teams that deliver value on the ground. This talk presents the three distinctive phases of growth for data science teams, highlighting potential challenges and suggesting a standard framework of guidelines to successfully navigate this evolution. Vastly different approaches are needed in each stage of maturity to tackle aspects such as strategic direction, project framework, the mix of skills, hiring strategies, and fostering of a data culture.
Data Lake Boot Camp
Length: 1 Hour
Description: A new data platform approach is needed to extend the data warehouse and address the vast quantity and variety of data flowing into organizations, much of it unstructured.
Title: The Data Warehouse Is Dead
Time: 10:45 AM - 11:45 AM
Description: The data warehouse is experiencing pressure from increasing data volumes, more users, and tight budgets—a triple threat to its ongoing existence and value. In addition, new data types are coming into play. This increased pressure means the old-school data warehouse may not be delivering insights at the speed of business. There are a number of alternatives to meet modern analytics infrastructure needs. This presentation outlines in detail why a modern data platform is required to deliver on new analytics demands.
Title: Exploiting Enterprise Data for Transformational Projects
Time: 10:45 AM - 11:45 AM
Description: In a data fabric, the data discovery and integration layer maps all enterprise data in its original business context so that users can find and blend data from diverse siloed sources into analytic-ready data sets on an on-demand basis. Join Cambridge Semantics CTO Sean Martin to hear how companies are using data discovery and integration solutions to exploit enterprise data for transformational analytic and machine learning projects.
Cognitive Computing & AI Summit
Length: 1 Hour
Speaker(s):
Seth Earley, CEO, Earley Information Science
Description: Artificial intelligence (AI) has the potential to completely revolutionize how we do business and increasingly affects people’s daily lives.
Title: No AI Without IA
Time: 10:45 AM - 11:45 AM
Description: Depending on who you talk to, AI will either enable massive productivity gains from your employees or replace them entirely. Hype aside, AI is coming, and companies need to understand how to harness it. Despite the promise of “plug and play” technology, real AI requires varying degrees of information architecture (IA), knowledge engineering, product and content architecture, and high-quality data sources to be effective.
12:00 PM
Moving to a Modern Data Architecture
Length: 45 Minutes
Speaker(s):
Craig S. Mullins, President & Principal Consultant, Mullins Consulting, Inc.
Description: There are game-changing technologies emerging in data management. But to win at the new world of Big Data, you have to know the changing rules.
Title: Database Trends 2020
Time: 12:00 PM - 12:45 PM
Description: The world of data management and administration is rapidly changing as organizations digitally transform. Mullins examines how database management systems are changing and adapting to modern IT needs. Understanding the trends occurring now and on the horizon is critical to being prepared for the rapidly changing data landscape. This presentation looks at cloud, analytics, NoSQL, IoT, in-memory, and DevOps and examines what is happening with DBAs and their roles within modern organizations. Mullins backs up the trends with references and links where appropriate.
Competing on Analytics
Length: 45 Minutes
Description: Emerging technologies such as AI, IoT, and machine learning are changing what is knowable about customers. At the same time, the frequency of data misuse is leading government entities and individuals to demand higher standards of accountability.
Title: Ethics, Data Ownership, & Privacy in Data Science
Time: 12:00 PM - 12:45 PM
Description: This presentation explores the issues around modernizing security and governance, as well as what it means to deliver transparency and what users actually expect. It also covers the need to manage accountability within systems of multiple decision-makers; why it is necessary to build fairness into the system to overcome bias and discrimination and to enable diversity; and the need to address expectations of privacy and appropriate use of data.
Title: Accelerating Analytics in a New Era of Data
Time: 12:00 PM - 12:45 PM
Description: Due to exponentially growing data stores, organizations today are facing slowdowns and bottlenecks at peak processing times, with queries taking hours or days. Some complex queries simply cannot be executed. Data often requires tedious and time-consuming preparation before queries can be run. This session will demonstrate how the power of GPUs can help conquer these challenges, enabling data professionals to rapidly analyze more data on more dimensions, for previously unobtainable business insights.
Data Lake Boot Camp
Length: 45 Minutes
Speaker(s):
Subhayan Das, Associate Director-Digital Capability Management, Bristol-Myers Squibb
Description: With the abundance of data stored in data lakes, finding the relevant information is increasingly challenging, particularly in light of the many formats in which the data appears.
Title: Data Discovery, Selection, and Provisioning
Time: 12:00 PM - 12:45 PM
Description: With the realization of the power of data lakes, more and more organizational data in various formats and standards are being made available there. Given this plethora of information, it is becoming increasingly daunting for users to search for the data of interest to them with the use of conventional data analytical tools. A combination of Data Discovery tools, making use of semantic search and concept search, brings in the right blend of capability, enabling "comparison shopping" between seemingly similar datasets, and allowing end users to evaluate the best fit while facilitating the discovery and reuse of all available information and data assets, both internal and external.
Cognitive Computing & AI Summit
Length: 45 Minutes
Description: The power of machine learning is particularly evident when used to predict events in the real world.
Title: Machine Learning in the Real World
Time: 12:00 PM - 12:45 PM
Description: Machine learning (ML) has become top of mind for many businesses. Wilde shares his experience and insights about ML in the real world.
2:00 PM
Moving to a Modern Data Architecture
Length: 45 Minutes
Description: It was hard enough to manage IT infrastructures when everything was on-premise only. But today, with combined on-premise deployments, SaaS, and hybrid cloud scenarios, there is uncertainty about the proper way to license software in these very complex environments.
Title: Straight Talk on the Cloud License Landscape
Time: 2:00 PM - 2:45 PM
Description: Keeping software in compliance is a more significant challenge today than ever before. Sorting through all the FUD (fear, uncertainty, doubt) and getting straight answers from the vendors on the proper way to license software in this complicated world is nearly impossible. Making matters worse is the fact that many software vendors have turned to software license audits as an easy way to generate additional revenues. This session covers current software licensing trends, important lessons learned from the real world, and the steps every organization should take now to avoid becoming a victim of a software license audit whose real purpose is to generate revenue.
Competing on Analytics
Length: 45 Minutes
Description: Organizations in all industries are under pressure to take advantage of Big Data and newer data sources for real-time decision making in mission-critical environments. New technologies provide opportunities to gain insight into the future.
Title: Fannie Mae’s Journey to a Data-Driven Organization
Time: 2:00 PM - 2:45 PM
Description: How does an organization evolve from an application-centric to a data-driven enterprise? This presentation covers how Fannie Mae embarked on a major transformation journey to modernize its data infrastructure, transitioning from legacy data platforms to more integrated and scalable architecture to capitalize on the growing opportunities of the analytics economy and generate substantial business value, internally and externally.
Title: Riding the Waves of Big Data Disruption: Machine Learning, Cloud Analytics, IoT, and More
Time: 2:00 PM - 2:45 PM
Description: As Big Data grows and evolves, your enterprise faces both challenges and market-disrupting opportunities to analyze and manage larger data volumes for business value. But with seemingly endless commercial, open source, and "as-a-service" offerings hitting the market each week, how do you choose the right mix of technologies and avoid creating an accidental architecture that will limit future innovation? How are organizations actually achieving true bottom-line benefits from their Big Data initiatives? Learn how to adopt an effective and agile approach to Big Data analytics.
Data Lake Boot Camp
Length: 45 Minutes
Description: As part of our deep dive into data lakes, our panel of experts contemplates success factors, failure avoidance, and new developments. Join us for an invigorating discussion.
Cognitive Computing & AI Summit
Length: 45 Minutes
Description: The customer experience (CX) is prime territory for employing AI technologies.
Title: AI-Driven CX
Time: 2:00 PM - 2:45 PM
Description: Let’s take a look at enhanced customer experiences for hospitality consumers delivered via virtual concierge services that provide resort guests with personal and contextual hospitality services. Built upon mobile use and location data, bot services help customers with a variety of services, ranging from in-room amenities and dinner reservations to event tickets. Don Spaulding provides an overview of the technical solution and powerful results of this virtualized solution.
3:15 PM
Moving to a Modern Data Architecture
Length: 45 Minutes
Speaker(s):
Danil Zburivsky, Director of Engineering, Kick Analytics-as-a-Service, The Pythian Group
Paul Wolmering, VP Worldwide Sales Engineering, Actian Corporation
Description: Data is flowing into organizations from a previously unimaginable array of sources and at unprecedented speed and volume. This means that the challenges of cleaning, deduplicating, and integrating data are increasing.
Title: Dismantling Data Silos Through Cloud Integration
Time: 3:15 PM - 4:00 PM
Description: A cloud-native data platform may be the best way for organizations to cost-effectively deliver on the promise of better insights and more intelligent systems through data. Danil Zburivsky covers how a cloud integration approach can lead to better data governance and more accurate analysis and ensure consistency of data across systems, as well as the best practices for cloud data integration and how a cloud data platform breaks down data silos within the organization. The presentation also looks at how one client successfully took its global sales data to the cloud to uncover new opportunities.
Title: Diving Under the Hood of Actian Avalanche, a Gen III Cloud Data Warehouse
Time: 3:15 PM - 4:00 PM
Description: From the perspective of an experienced engineering thought leader, Paul Wolmering, VP Worldwide Sales Engineering, Actian Corporation, will deliver a deep dive into Actian’s newly launched Gen III cloud data warehouse. Learn about key considerations for building a fully managed, multi-cloud data warehouse with federated query capabilities that is built for hybrid data. Understand key success factors for migrating to columnar analytics to gain actionable insights from an operational data warehouse. Learn what it takes to deliver insights from real-time data economically and at scale with hybrid data regardless of location: in the cloud, on-premises, or both.
Competing on Analytics
Length: 45 Minutes
Speaker(s):
Robin Rappaport, Senior Operations Research Analyst, IRS-RAAS (Research, Applied Analytics, and Statistics)
Description: With the vast quantities of data flowing into organizations, the job of cleansing and validating data is only becoming more difficult. In order to gain the kind of insights and outcomes that organizations seek, new processes and technologies must be deployed.
Title: Flipping the 80/20 Rule of Data Prep and Analysis
Time: 3:15 PM - 4:00 PM
Description: The IRS Compliance Data Warehouse (CDW) is an analytical data warehouse used for research purposes. It empowers researchers to spend more time on analytics and less on data wrangling. To ensure all data is loaded properly, consistent, well-thought-out validation steps must be included in the ETL process. This presentation offers a case study of accomplishments and lessons learned since FY 2016, including the data quality issues identified by CDW users (data stewards), and takeaways for attendees on how to improve decision making.
Data Lake Boot Camp
Length: 45 Minutes
Description: Data lakes are highly appealing as they provide the capacity to support all types of data and maintain it in its original format for future purposes. Before diving in, it’s important to be aware of the components of a successful data lake implementation.
Title: Uber’s Hadoop Data Ingestion and Dispersal Framework
Time: 3:15 PM - 4:00 PM
Description: Marmaray, Uber’s general-purpose Apache Hadoop data ingestion and dispersal framework and library, was open-sourced in 2018. Marmaray was envisioned, designed, and ultimately released in late 2017 to fulfill the need for a flexible, universal dispersal platform that would complete the Hadoop ecosystem by providing the means to transfer Hadoop data out to any online data store. Before Marmaray, each team was building its own ad hoc dispersal systems, which resulted in duplicated efforts and an inefficient use of engineering resources.
Title: Building a Data Lake in Two Weeks Without Writing Code
Time: 3:15 PM - 4:00 PM
Description: Rafael shares how an ad-tech company moved from a DW to a fully functional data lake that processes over 400,000 events per second without writing one line of code.
Cognitive Computing & AI Summit
Length: 45 Minutes
Description: The combination of Big Data with AI technologies creates both challenges and opportunities. Identifying the factors that turn challenges to opportunities leads to success.
Title: AI On the Edge
Time: 3:15 PM - 4:00 PM
Description: With edge computing becoming a thing, AI on-the-edge is quickly following suit. It unlocks a whole new world of possibilities, including predicting customer needs before they even know them. But edge AI seems like it’s only a game for the most cutting-edge companies like Apple, Amazon, or Tesla, to name a few. Traditional enterprises aren’t really embracing it out of fear it may cost too much or due to uncertainty about the potential ROI. To tap into this opportunity, organizations don’t need to choose a risky “all in” approach; a small iterative approach reduces the risk while ensuring your edge AI project aligns with your overall business strategy. Join this session to learn how to apply a Minimal Viable Prediction (MVP) approach to your next edge AI project.
4:15 PM
Moving to a Modern Data Architecture
Length: 45 Minutes
Speaker(s):
Jeff Crume, Distinguished Engineer, IT Security Architect, IBM
Description: There are significant benefits offered by IoT, but also new threats and dangers. Do we really understand the challenges posed by all these connected “things”?
Title: The Dark Side of the Internet of Things
Time: 4:15 PM - 5:00 PM
Description: With the Internet of Things (IoT), essentially everything becomes a computer. This means that everything can be hacked—including cars, home appliances, medical devices, and more. This presentation provides examples of IoT hacks and the consequences of not getting security right as we move forward in the world of smart and connected machines.
Competing on Analytics
Length: 45 Minutes
Description: Best-selling author David Weinberger previews his new book on everyday chaos.
Title: How Machine Learning Is Changing the Future as a Fact and as an Idea
Time: 4:15 PM - 5:00 PM
Description: Ultimately, machine learning’s most important effect may not be in the benefits its use brings, but how it is implicitly transforming our understanding of how the world works and our most basic strategies for dealing with the future. From Newton on through the Computer Age, we have assumed that the universe is ruled by a relative handful of laws that are the same everywhere and that are simple enough for us to understand. But machine learning shows us a world of motes of data in networks so dense with connections and so delicately balanced, we sometimes can’t understand them. This sort of model of the world is changing not only our strategies, but our moral sense, our ideas about meaning, and even what makes humans special.
Data Lake Boot Camp
Length: 45 Minutes
Description: New frameworks and platforms are enabling organizations to improve time-consuming processes and meet enterprise requirements for high performance.
Title: Tackling Data Ingestion Challenges at LinkedIn With Apache Gobblin
Time: 4:15 PM - 5:00 PM
Description: Apache Gobblin is a distributed data integration framework for both streaming and batch data ecosystems. This presentation covers how Gobblin powers several data processing pipelines at LinkedIn and use cases such as ingestion of more than 300 billion events for thousands of Kafka topics on a daily basis, metadata and storage management for several petabytes of data on HDFS, and near real-time processing of thousands of enterprise customer jobs. It also looks at the key Gobblin features that help LinkedIn build and run these data pipelines at extreme scale.
Cognitive Computing & AI Summit
Length: 45 Minutes
Description: Everyone’s talking about machine learning, but we hear much less about how to put it into practice.
Title: Exploring Machine Learning on the Google Cloud Platform
Time: 4:15 PM - 5:00 PM
Description: Only 10 years ago, you needed access to extensive academic and computing resources to make use of machine learning (ML). Fast-forward to today, and we’ve seen revolutionary changes in the hardware and software that are making ML accessible for any developer or data scientist. Whether you’re completely new to ML or you’ve already trained and deployed your own model from scratch, Google Cloud Platform has a variety of tools to help you start using ML right now. Sara Robinson starts with the basics: how to use a pre-trained ML model with one REST API call. Then she explains how to use your own dataset to customize a pre-trained model with transfer learning, how to build your own model from scratch with TensorFlow, and how to train and serve it in the cloud with GCP.