Data Summit 2018 Program

Monday, May 21

Preconference Workshops - Morning

W1. Introduction to Hadoop

Monday, May 21: 9:00 a.m. - 12:00 p.m.

Hadoop has forever changed the economics and dynamics of largescale computing, and its use among enterprises looking to augment their traditional data warehouses continues to grow. Join this workshop to explore the basics of Hadoop, including the Hadoop Distributed File System, MapReduce, and the budding ecosystem of Hadoop software projects. Learn best practices for installing and configuring Hadoop in your environment, managing its performance, and developing Big Data applications.

Speaker:

Marco Vasquez, Senior Technical Director, MapR

W2. Introduction to Data Science

Monday, May 21: 9:00 a.m. - 12:00 p.m.

Data science, the ability to sift through massive amounts of data to discover hidden patterns and predict future trends, may be the “sexiest” job of the 21st century, but it requires an understanding of many different elements of data analysis. This workshop dives into the fundamentals of data exploration, mining, and preparation, applying the principles of statistical modeling and data visualization in real-world applications.

Speaker:

Joe Caserta, Founding President, Caserta

Monday, May 21

Preconference Workshops - Afternoon

W3. Introduction to Cognitive Computing

Monday, May 21: 1:30 p.m. - 4:30 p.m.

Cognitive computing encompasses many technologies that advance the intelligent processing of high-level information. Applications in a variety of situations require tailoring the technologies to business requirements. This workshop introduces the components of cognitive computing for business uses.

Speaker:

Hadley Reynolds, Co-founder, Cognitive Computing Consortium

W4. Introduction to Machine Learning

Monday, May 21: 1:30 p.m. - 4:30 p.m.

From recommender systems to disease diagnosis, machine learning is revolutionizing the process of complex decision making by enabling the analysis of bigger, more complex datasets and the delivery of faster, more accurate results. This workshop examines the statistical and algorithmic principles for developing scalable, real-world machine learning pipelines and applications.

Speaker:

Marina Johnson, Assistant Professor of Analytics, Information Management/Business Analytics, Montclair State University and Drexel University

W5. Introduction to Blockchain

Monday, May 21: 1:30 p.m. - 4:30 p.m.

Blockchain, most widely known as the underlying technology for cryptocurrencies, transcends that narrow niche. As a distributed ledger system, blockchain technology can benefit government and corporate transactions as well. Learn the basics of blockchain in this workshop.

Speaker:

Paul A Tatro, Founder, Blockchain U Online

Tuesday, May 22

Keynotes

Welcome & Keynote - Once We Know Everything

Tuesday, May 22: 9:00 a.m. - 9:45 a.m.

We, of course, will never know everything. But with the arrival of Big Data, machine learning, data interoperability, and all-to-all connections, our machines are changing the long-settled basics of what we know, how we know, and what we do with what we know. Our old—ancient—strategy was to find ways to narrow knowledge down to what our 3-pound brains could manage. Now it’s cheaper to include it all than to try to filter it on the way in. But in connecting all those tiny datapoints, we are finding that the world is far more complex, delicately balanced, and unruly than we’d imagined. This is leading us to switch our fundamental strategies from preparing to unanticipating, from explaining to optimizing, from looking for causality to increasing interoperability. The risks are legion, as we have all been told over and over. But the change is epochal, and the opportunities are transformative.

Speaker:

David Weinberger, Harvard metaLAB and Harvard Berkman Klein Center

Track A: Moving to a Modern Data Architecture

Moderator:

Dino Eliopulos, Managing Director, Earley Information Science

A101. Modern Data Architecture Design

Tuesday, May 22: 10:45 a.m. - 11:45 a.m.

Modern architecture has evolved beyond the traditional data warehouse to include logical data warehouses, data lakes, distribution hubs, data catalogs, analytical sandboxes, and data science hubs, along with both self-service data preparation and BI.

Designing a Data Architecture for Modern Business Intelligence & Analytics

10:45 a.m. - 11:45 a.m.

This session looks at the data architecture for modern business intelligence and analytics which must support structured, unstructured, and semi-structured sources and hybrid integration and data engineering as well as analytical uses by casual information consumers, power users, and data scientists. Technologies include databases (relational, columnar, in-memory, and NoSQL); hybrid data, application, and cloud integration; data preparation; data virtualization; descriptive, diagnostic, predictive, and prescriptive analytics; and on-premise and on-cloud deployments.

Speaker:

Richard Sherman, Managing Partner, Athena IT Solutions

Polyglot Persistence Versus Multi-Model Databases

10:45 a.m. - 11:45 a.m.

Relational, columnar, object, XML, and graph databases are flourishing, but many applications need all of these for different capabilities. Those who tout "polyglot persistence" insist that one size cannot fit all and focus on integrating multiple data stores. At the same time, the multi-model database is on the rise, and most leading operational DBMSs offer multiple data models. Fried considers the pros and cons, what is possible, what is best for performance, and what is practical. Should you use a multi-model database in your next project?

Speaker:

Jeff Fried, Director, Platform Strategy & Innovation, InterSystems

A102. Making the Shift From Relational to NoSQL

Tuesday, May 22: 12:00 p.m. - 12:45 p.m.

Enterprises today are competing on analytics, and this requires the right combination of technologies. Increasingly, that means a combination of data management systems spanning NoSQL and relational, cloud, and on-premise.

NoSQL Concepts for the Relational DBA

12:00 p.m. - 12:45 p.m.

More and more, DBAs who have traditionally managed relational database systems such as Oracle and SQL Server are being tasked with managing companies’ "non-relational" databases. Platforms such as MongoDB and Cassandra are coming under the management of enterprise IT, requiring a new set of skills for their existing teams. Hall explores concepts familiar to relational DBAs, such as data modeling, high availability, and scalability, and discusses how those concepts translate into NoSQL platforms.

Speaker:

Jason Hall, Senior Solutions Architect, Quest Software

A103. IoT Data Strategies

Tuesday, May 22: 2:00 p.m. - 2:45 p.m.

IoT environments often pose a data management problem because of the huge volumes of data that are created and the latencies inherent in having global distribution.

IoT Data: From the Edge to the Cloud and Back

2:00 p.m. - 2:45 p.m.

The challenges of aggregating data from consumer-oriented devices, such as wearable technologies and smart thermostats, are fairly well-understood. However, there are a new set of challenges for IoT devices that generate megabytes or gigabytes of data per second. Certainly, the infrastructure will have to change, as those volumes of data will likely overwhelm the available bandwidth for aggregating the data into a central repository. Ochandarena discusses a whole new way to think about your next-gen applications and how to address the challenges of building applications that harness all data types and sources.

Speaker:

Will Ochandarena, Senior Director, Product Management, MapR Technologies

Top Killer Use Cases of IoT Analytics with Machine Learning at Scale

2:00 p.m. - 2:45 p.m.

Does the thought of capturing massive volumes of streaming sensor data from your hundreds or thousands of connected assets give you heartburn? Perhaps you are considering how to implement analytics at the edge to relieve your data warehouse. Maybe you are storing time-series data in your Hadoop or S3 data lake as you build a solid and strategic IoT business case. No matter where you are on the IoT analytics curve, the Vertica analytical database provides leading intelligent device manufacturers with the highest levels of query performance and massively scalable in-database Machine Learning and analytics to derive bottom-line business value.

Speaker:

Jeff Healey, Senior Director, Vertica Product Marketing, Micro Focus

A104. Overcoming Big Data Integration Challenges

Tuesday, May 22: 3:15 p.m. - 4:00 p.m.

With the rise of Big Data, there is the need to leverage a wider variety of data sources as quickly as possible for real-time decision making in mission-critical environments.

Data Acquisition to Support Trading Data Analytics

3:15 p.m. - 4:00 p.m.

This technical presentation shows how a global investment management firm architected continuous data feeds using data integration technology so that it could enable real-time data analytics for best execution. It covers the cloud-based trading data analytics platform that leverages HVR as a key, real-time data ingestion tool.

Speaker:

Joseph deBuzna, VP Field Engineering, HVR

Automating Data Architecture Design

3:15 p.m. - 4:00 p.m.

Traditionally, the selection and user of a data integration platform presupposes the existence of a trusted, proven, and accepted data architecture, However, these platforms can be just as valuable developing data architectures as they are in operating them.

Speaker:

Kevin Scott, Principal Sales Engineer, CloverETL

A105. The Rise of Graphs & Enterprise Data Fabric

Tuesday, May 22: 4:15 p.m. - 5:00 p.m.

With the rapidly increasing amount of data being generated, organizations spanning a range of industries are undergoing a digital transformation to analyze and query large amounts of data at high speeds.

The Rise of Graph Databases: From Thomson Reuters to Amazon Neptune

4:15 p.m. - 5:00 p.m.

The recent announcements of Thomson Reuters and Amazon Neptune providing knowledge graphs to their customers validate the effectiveness of using graph-based information as opposed to traditional databases to derive insights and business value. Martin analyzes why Thomson Reuters and Amazon Neptune have turned to graph databases as disruptive and necessary technologies and explains how companies in data-intensive industries, such as financial services, healthcare, pharmaceutical, and oil and gas, can use graph-based technologies as a new marketing strategy to maintain existing customers and attract new ones.

Speaker:

Sean Martin, CTO, Cambridge Semantics

Reversing and Querying Complex Graph Data in Real Time

4:15 p.m. - 5:00 p.m.

We live in a data-driven world, where the data flowing through systems is extremely complex, constantly changing, large in volume, and highly connected. Many use cases such as fraud detection and prevention, customer 360, and IoT network management involve capturing and analyzing massive amounts of highly connected data to identify hidden relationships and patterns. By unlocking the values in those relationships, you gain contextual insights of your data.

Traditional relational database and other NoSQL systems are not suited for those types of use cases because the technologies are primarily focused on the entities as opposed to the relationships. This is where graph databases come in so handy. They make it easy to discover, explore, and make sense of complex relationships. By leveraging the insights in data relationships you can deliver more relevant, real-time experiences for your customers, proactively fight fraud, and ensure the health and seamless operations of your network.

Heath explores how DataStax Enterprise (DSE) Graph, coupled with Expero’s expertise in graph and analytics, empowers users to explore and visualize complex graph data in innovative and meaningful ways.

Speaker:

Scott Heath, CRO, Expero

Tuesday, May 22

Track B: Competing on Analytics

Moderator:

Lindy Ryan, Professor & Research Faculty, Montclair State University; Rutgers University

B101. Becoming a Data-Driven Enterprise

Tuesday, May 22: 10:45 a.m. - 11:45 a.m.

The promises of Big Data are many, including the ability to react faster to opportunity and risk, to achieve a panoramic view of customers, and to use client data to create hyper-personalized experiences.

Five Executive Imperatives to Compete in the Analytics Economy

10:45 a.m. - 11:45 a.m.

The demand to become a data-driven business with a competitive edge in the digital economy is greater now than ever. As we embrace the idea that the analytics economy will power the digital economy by compounding the value of data and analytics assets, executives must know where to focus their leadership efforts. This session takes a close look at the five key factors necessary to capitalize on the growing opportunities of the analytics economy and generate substantial business value, internal and external.

Speaker:

Anne Buff, Business Solutions Manager, SAS Best Practices, SAS Institute

The Business of Data: Defining Your Strategy

10:45 a.m. - 11:45 a.m.

How can you reap the benefits of the massive data volumes now available to your organization? To truly harness the power of your data, you need a solid strategy that incorporates everything—from security to data governance to choosing the right technologies. Learn about the five key elements of a successful data strategy; what is driving the need for a new type of data platform (including analytics, data science, and machine learning); how a modern data platform can deliver self-service analytics, empowering more people with data; new use cases that are impacting how companies make technology choices; and what you can do now to take your data strategy to the next level.

Speaker:

Lynda Partner, VP, Products and Offerings, The Pythian Group

B102. A 360° Customer View

Tuesday, May 22: 12:00 p.m. - 12:45 p.m.

Companies are collecting treasure troves of information about their customers, but they need the right tools and technologies to connect that data to create the comprehensive 360° view that is critical to leveraging the data’s full value.

Overcoming Data Silos to Get a Unified View of Customers

12:00 p.m. - 12:45 p.m.

A 360° view of customers makes generating more of the right customers—who become successful, pay their bills on time, renew, and grow—possible. However, according to Gartner, less than 10% of companies aggregate data for a 360° customer view that can enable business growth. Pines provides a practical approach to achieving the 360° view of customers, along with the questions you should be asking of your business to ensure it’s on a growth path. You also learn about emerging tools and techniques that can be used to connect marketing, sales, support, and finance data to power your analytics with data that paints a holistic picture of your customers.

Speaker:

Zak Pines, VP Marketing, Bedrock Data

Achieving a 360 Customer View Starts with Addressing Intelligence

12:00 p.m. - 12:45 p.m.

Data intelligence about people, places and things, along with the connection between physical and digital addresses has a serious impact on understanding your customer. Knowing your customer all starts with knowing where they are located, detailed intelligence about what is around them, and how to reach them at the right time with the right message.

Speaker:

Dan Adams, VP Product Management, Data, Pitney Bowes

B103. Analytics in Action

Tuesday, May 22: 2:00 p.m. - 2:45 p.m.

Innovative organizations in a range of fields are putting data to work to predict what customers want, give them products they need, and engage with them more effectively.

Disrupting Paradigms With Analytics

2:00 p.m. - 2:45 p.m.

How can you use analytics to succeed? It may well be in different ways than you are accustomed to thinking about analytics. Analytics tools that provide actionable insights are the gold standard in today's Big Data (and even small data) world. Data intelligence enables advanced and agile analytics, digital enterprises, and robotics. Come learn from a team passionate about data.

Speakers:

Fabricio Silva, Informationist, Knowledgent

Yan Ge, Director, Data Analytics, Takeda Pharmaceuticals

Exponentially Growing Data Stores: Burden or Opportunity?

2:00 p.m. - 2:45 p.m.

Joel Sehr shares how to analyze more data, faster, at lower cost, whie reducing MPP overload. Leverage more comprehensive analytics for targeted product and service offerings.

Speaker:

Joel Sehr, VP, Americas, SQream

B104. Beating the Barriers to Better Analytics

Tuesday, May 22: 3:15 p.m. - 4:00 p.m.

Faster time to insight is driving the use of big data technologies, but far too much time is still spent preparing data for analysis. New approaches are available to help take down the barriers to analytics.

From Bottlenecks to Breakthroughs: The Rise of Data Wrangling

3:15 p.m. - 4:00 p.m.

Data preparation, or "data wrangling" as it's often referred to, is widely considered the biggest bottleneck in any analytics process—taking up more than 80% of the time and resources in any data project. Davis reviews the inefficiencies of traditional data preparation techniques and why there is a need for a new set of self-service tools to create new levels of analytics productivity. New data wrangling solutions combine the latest techniques in data visualization, human-computer interaction, and machine learning to enable a wider set of data workers to prep data themselves, as well as improve the speed and accuracy of these processes.

Speaker:

William Davis, Director, Product Marketing, Trifacta

Presentation Title TBD

3:15 p.m. - 4:00 p.m.

Speaker:

Jason Hall, Senior Solutions Architect, Quest Software

B105. The Future of Analytics

Tuesday, May 22: 4:15 p.m. - 5:00 p.m.

Collecting data is one thing; using it to make better decisions is another. Understanding which analytics approaches will yield the best results is critical.

The Future of Analytics is Visual

4:15 p.m. - 5:00 p.m.

Lindy Ryan reviews recent research she's done on the future of analytics and concludes that data visualization will be a major job skill needed by businesses of all types. Visualization helps people understand data and communicate the meaning of data to others. Although many data visualization tools exist, the one highest in popularity is Tableau.

Speaker:

Lindy Ryan, Professor & Research Faculty, Montclair State University; Rutgers University

Knowledge Analytics: Managing & Mitigating Risks

4:15 p.m. - 5:00 p.m.

A forthcoming text published by Taylor & Frances considers the intersection of business analytics and knowledge management. Most of the "data analytics" in use today has not changed in its essential methods—only in terms of the tools used. The application of stochastic/basic statistical methods to text and language is known to introduce risks for business decision makers, while the use of analytical methods with linguistic engines reduces the risks. It is only when knowledge elicitation and representation methods are combined with the use of linguistic/semantic tools that we significantly reduce risks. Bedford and McBreen consider eight recent applications based on stochastic methods that produced unreliable results, and then, in contrast, describe success stories—one from the private sector and one from the public sector— that have resulted from the combined method now referred to as knowledge analytics.

Speaker:

Denise A.D. Bedford, Faculty, Communication Culture and Technology, Georgetown University; & Author, Organizational Intelligence & Knowledge Analytics and York University, Coventry University

Tuesday, May 22

Track H: Hadoop Day

Moderator:

Joe McKendrick, Principal Researcher, Unisphere Research

H101. The Big Data Ecosystem Today

Tuesday, May 22: 10:45 a.m. - 11:45 a.m.

The expanding array of data, data types, and data management systems is making the enterprise data landscape more complicated. It is all about finding the right balance for data access and management.

SQL's Sequel: Hadoop & the Post-Relational Revolution

10:45 a.m. - 11:45 a.m.

We are now in the Big Data era, thanks to an explosion in the volume, velocity, and variety of data. We are also now in the post-relational era, thanks to a proliferation of options for handling Big Data more naturally and efficiently than relational database management systems (RDBMS). That’s not to say that we’re done with RDBMS; rather, that Big Data is better handled by technologies such as Hadoop, HBase, Cassandra, and MongoDB, which provide scale-out, massively parallel processing (MPP) architectures. This presentation discusses the rise of Hadoop and other MPP technologies and where they fit into an enterprise architecture in the Big Data era.

Speaker:

David Teplow, Founder & CEO, Integra Technology Consulting

SQL on Big Data—Technology, Architecture, & Innovations

10:45 a.m. - 11:45 a.m.

This comprehensive overview of SQL engines on Big Data focuses on low latency. SQL has been with us for more than 40 years and Big Data technologies for about 10 years. Both are here to stay. Pal covers how SQL engines are architected for processing structured, unstructured, and streaming data and the concepts behind them. He also covers the rapidly evolving landscape and innovations happening in the space—with products such as OLAP on Big Data, probabilistic SQL engines such as BlinkDB, HTAP-based solutions such as NuoDB, exciting solutions using GPUs with 40,000 cores to build massively parallel SQL engines for large-scale datasets with low latency, and the TPC-Benchmark 2.0 for evaluating the performance of SQL engines on Big Data.

Speaker:

Sumit Pal, Strategic Technology Director, Graphwise.ai and Independent Consultant

H102. Data Lake Best Practices

Tuesday, May 22: 12:00 p.m. - 12:45 p.m.

The concept of a data lake that encompasses data of all types is highly appealing. Before diving in, it is important to consider the key attributes of a successful data lake and the products and processes that make it possible.

The Data Lake Toolkit

12:00 p.m. - 12:45 p.m.

Making the case for collaboration and diverse analytical workloads are the two key goals when designing a data lake. Modern data lakes contain an incredible variety of datasets, varying in size, formats, quality, and update frequency. The only way to manage this complexity is to enable collaboration, which not only promotes reuse, but also enables the network effect that helps solve some of the vexing problems of quality and reusability. Given the scale and complexity of data, moving it outside of the lake is not only impractical but also expensive, so the data lake needs to support diverse needs and the resulting diverse workloads.

Speaker:

Mukund Deshpande, VP, Data Analytics, Accelerite

Smart Data Lakes, Knowledge Graphs, and the Semantic Layer: An Enterprise Information Fabric Road Map

12:00 p.m. - 12:45 p.m.

Only with a rich interactive semantic layer, based on knowledge graph technology and situated at the heart of the data lake, can organisations hope to delivery true on-demand access to all of the data, answers, and insights -- woven together as an enterprise information fabric.

Speaker:

Sean Martin, CTO, Cambridge Semantics

H103. The Hybrid Future of Big Data

Tuesday, May 22: 2:00 p.m. - 2:45 p.m.

Cutting-edge Big Data technologies are easily accessible in the cloud today. However, overcoming integration challenges and operationalizing, securing, governing, and enabling self-service usage in the cloud can still be vexing concerns, just as they are on-premise.

Accelerating Analytic Database Performance

2:00 p.m. - 2:45 p.m.

Database characteristics that impact query performance for BI and analytic use cases include the use of columnar structures, parallelization of operations, memory optimizations, and scaling to high numbers of concurrent users. Maguire also covers the requirements for handling updates for real-time analytics.

Speaker:

Walt Maguire, VP Systems Engineering, Actian

Three Questions You Aren't Asking That Will Make Your Data Strategy Hum

2:00 p.m. - 2:45 p.m.

What three important questions should business leaders consider asking the next time they need to make a technology decision for a data monetization project? Get your guidance from Joe de Buzna.

Speaker:

Joseph deBuzna, VP Field Engineering, HVR

H104. Unleashing Your Big Data With Spark

Tuesday, May 22: 3:15 p.m. - 4:00 p.m.

Big Data requires processing on a massive scale. Newer open source technologies such as Spark can help to enable Big Data processing for use cases that were previously unimaginable.

Building a Recommender System With Machine-Learning & Spark

3:15 p.m. - 4:00 p.m.

Outbrain is the world’s largest discovery platform, bringing personalized and relevant content to audiences while helping publishers understand their audiences through data. Outbrain uses a multiple-stage machine learning workflow over Spark to deliver personalized content recommendations to hundreds of millions of monthly users. This talk covers its journey toward solutions that would not compromise on scale or on model complexity and design of a dynamic framework that shortens the cycle between research and production. It also covers the different stages of the framework, including important takeaway lessons for data scientists as well as software engineers.

Speaker:

Shaked Bar, Tech Lead & Algorithm Engineer, Outbrain

H105. The Challenges & Tactics for Improving Analytic Performance

Tuesday, May 22: 4:15 p.m. - 5:00 p.m.

Walt Maguire introduces analytic case studies including one from Craig Strong, chief technology and product Officer at Hubble, who describes how Hubble is able to provide real-time corporate performance management (CPM) through high-speed analytics dashboards. Hubble’s dashboards draw from hybrid data sources to provide dynamic dashboards that allow for ad hoc query and analysis of near real-time corporate performance. The results from the performance tests that Hubble ran, which compared the performance seen from Actian Vector to a selection of databases including SQLServer, Mem SQL, SAP, Presto, Spark and RedShift, are presented along with the results of recent scaled, cloud-based databases and factors to consider including configurations, query complexity, database size, and concurrency, for such performance tests.

Speaker:

Walt Maguire, VP Systems Engineering, Actian

Tuesday, May 22

Track CS: Cognitive Computing Summit

Moderator:

Edee Edwards, Taxonomy & Ontology Manager, FM Global, USA

CS101. The Rise of AI

Tuesday, May 22: 10:45 a.m. - 11:45 a.m.

Artificial intelligence (AI) is finally coming into its own as cognitive computing becomes the norm.

Past, Present & Future: What Lies Ahead for AI

10:45 a.m. - 11:45 a.m.

While the industry is abuzz talking about the rise of artificial intelligence, the term itself is not new. In fact, the term “AI” was first coined in 1956 but fell off the radar after no monumental achievements were accomplished in the following years. But given recent advancements in analytics, visualization, and machine learning, artificial intelligence has re-emerged with a promising future. However, the question remains—will it succeed this time around?

Speaker:

Todd Sundsted, CTO, SumAll

Unlocking the Power of Data Science With Machine-Learning and AI

10:45 a.m. - 11:45 a.m.

How can you unlock the power of data science for data-driven decision support? From a business standpoint, Bulusu details how to use machine learning techniques for predictive analytics. He also illustrates how AI can meet BI (business intelligence) through industry use cases and talks about integrating results with a BI platform, using Oracle Advanced Analytics.

Speaker:

Lakshman Bulusu, Consultant & VP of Research, Matlen Silver, Qteria

CS102. The Path to Cognitive Search

Tuesday, May 22: 12:00 p.m. - 12:45 p.m.

Search is not a new activity for most of us, but cognitive search adds new dimensions and functionalities that enhance the UX.

How Machine Learning Changes the Search Paradigm

12:00 p.m. - 12:45 p.m.

Since 1994, when the first search engine was deployed, search has worked like this: A user entered search terms in a field, hit the search button, got a list of documents called search results, reviewed the list to find a few that might be relevant, downloaded one document, manually scanned it to see if it’s on point, and then returned to the list and continued the process until frustrated or out of time. Users are overwhelmed, tired, and wish they had a personal search assistant. They want to dispense with search altogether. They demand to know why the machine can’t just read all those documents for us and tell us what it finds. Well, now it can! Machine learning can change the entire search paradigm.

Speaker:

David Seuss, Founder & CEO, Northern Light

CS103. Machine Learning in the Real World

Tuesday, May 22: 2:00 p.m. - 2:45 p.m.

As one of the technologies getting loads of attention recently, machine learning has interesting applications for many types of enterprises.

Reuters News Tracer: The Journalist Bot in Training

2:00 p.m. - 2:45 p.m.

Reuters News Tracer is a capability that applies AI in journalism to find events breaking on Twitter. It assigns them a newsworthiness score so people can focus on the events that are important. The real magic of Reuters News Tracer is that it gives a confidence score about how likely it is that those events are true. This is really critical, given the rapidly changing landscape of news, including “fake news,” and the distrust of media reports.

Speaker:

John Duprey, Senior Architect, Thomson Reuters Labs' Center for AI & Cognitive Computing, Thomson Reuters

Modernizing Enterprise Search for Research Portals Through Machine Learning

2:00 p.m. - 2:45 p.m.

Access to information and insights is critical for today’s complex enterprises, unlike consumer experiences with Google, Alexa, or Siri, enterprise search is broken. Advancements in ML, NLP, and text analytics are changing the dynamic for enterprise employees. This session explores how one large, global bank is delivering a research portal for its analysts that understands not just what the user asks, but what is meant; uses intent to determine which ML relevancy model to apply; and creates a graphical, interactive results view that delivers actionable insight. The session concludes with an overview of the results the bank has experienced, and how it’s looking to extend the ML-based search capabilities into other areas.

Speaker:

Will Johnson, CTO and Co-Founder, Attivio

CS104. From Zero To Hero With TensorFlow

Tuesday, May 22: 3:15 p.m. - 4:00 p.m.

Another technology gaining traction in our cognitive computing world is deep learning, which changes many business processes.

Getting Started with TensorFlow and Deep Learning

3:15 p.m. - 4:00 p.m.

The hottest topic in computer science today is machine learning and deep neural networks. Many problems deemed "impossible" only 5 years ago have now been solved by deep learning—playing GO, recognizing what is in an image, and translating languages are but a few examples. Software engineers are eager to adopt these new technologies as soon as they come out of research labs and the goal of this session is to equip you to do so. This session focuses on two demos: using TensorFlow for linear regression (making a numerical prediction from inputs) and using a neural network to make predictions from text inputs. Along the way, I'll show some live demos and give you tips to apply these techniques in your own projects. No PhD required.

Speaker:

Sara Robinson, Developer Advocate, Google

CS105. PANEL: Beyond “Do No Harm”— The Complex Considerations of Ethical AI

Tuesday, May 22: 4:15 p.m. - 5:00 p.m.

While tech giants have attempted to allay public concerns about “inhuman” negative or conflicting behaviors on the part of AI applications by proposing pledges to be good, many stakeholders, including governments, corporations, researchers, nonprofits, auto companies, and consumers, are asking for true accountability for the actions of emerging intelligent systems. This panel of experts discusses the growing body of work by the academic, scientific, and standards communities geared specifically to expand and tighten our understanding of the ethical context implicit (and explicit) in AI applications, from diagnosing disease to driving autonomous vehicles.

Moderator:

Hadley Reynolds, Co-founder, Cognitive Computing Consortium

Speakers:

David Weinberger, Harvard metaLAB and Harvard Berkman Klein Center

Sara Mattingly-Jordan, Assistant Professor, Center for Public Administration & Policy, Virginia Tech

Wednesday, May 23

Keynotes

Keynote - Capturing Value With Cognitive Computing: Executing on Data Challenges

Wednesday, May 23: 8:45 a.m. - 9:30 a.m.

As the market moves from fascination with the wonders of the current bloom of Big Data technologies to reconsideration of their risks and perils, enterprises and IT operations are left to figure out how these developments might impact their business—and when. Hadley Reynolds, co-founder of the Cognitive Computing Consortium, presents an open reference framework jointly developed with Babson College’s technology management program. The framework gives executives and operating managers a tool to characterize the impact and behaviors of potential AI applications. Beyond impacts and behaviors, the framework integrates profiles of skills and resources required to effectively execute cognitive tasks.

Speaker:

Hadley Reynolds, Co-founder, Cognitive Computing Consortium

Track A: Building the Data-Driven Future

Moderator:

Dino Eliopulos, Managing Director, Earley Information Science

A201. The Rise of DataOps

Wednesday, May 23: 10:45 a.m. - 11:30 a.m.

DataOps is emerging as a methodology for data scientists, developers, and other data-focused professionals to enable an agile workflow that helps increase creativity and speed while also adhering to data governance requirements.

Making Big Data Bite-Size With DataOps

10:45 a.m. - 11:30 a.m.

Big Data, similar to relational data stores, experiences the challenges of data gravity, both the attraction and the weight, which cause friction and can be detrimental to development and testing teams’ ability to get the data they need to be successful. Hear how to move from simple DevOps with tools and automation, while making the data central to the solution and removing it as the pain point.

Speaker:

Kellyn Pot'Vin-Gorman, Technical Intelligence Manager, Delphix

A202. Modern Analytical Techniques

Wednesday, May 23: 11:45 a.m. - 12:30 p.m.

When looking to a data-driven future, consider modern analytical techniques to guide you.

Multiparadigm Data Science

11:45 a.m. - 12:30 p.m.

Multiparadigm Data Science is a new approach of using modern analytical techniques, automation, and human-data interfaces to arrive at better answers with flexibility and scale. Many organizations are still doing traditional data science -- confining themselves to problems that are answerable with traditional statistical methods -- rather than utilizing the broad range of interfaces and techniques available today. This session covers the basics of Multiparadigm Data Science, including automated machine learning, interactive notebooks and report generation, natural language queries of data for instant visualizations, and implementing neural networks with ease and efficiency.

Speaker:

Erez Kaminski, Wolfram Technology Specialist, Global Technical Services, Wolfram Research, Inc.

A203. Preparing for GDPR

Wednesday, May 23: 2:00 p.m. - 2:45 p.m.

Two days from now, on May 25, 2018, the EU’s General Data Protection Regulation (GDPR) becomes effective and bring a significant shift in how data is obtained, managed, processed, and disposed of.

GDPR: Writing Policy for Big Data

2:00 p.m. - 2:45 p.m.

Big Data and even bigger marketing have been the rallying cry of most organizations over the past decade. The premise has been that the more data we can gather, hold, and exploit, the more fiscal return on investment there would be—either through straight-up analytics, advertisement sales, or direct targeting. But what happens in a world where products and pricing are overcome with privacy and personal value? Podnar addresses the need to govern Big Data in the context of laws and regulations, especially GDPR, and what tactics you can effectively adopt to succeed in this new privacy and regulatory world.

Speaker:

Kristina Podnar, Digital Policy Consultant, NativeTrust Consulting LLC

A204. The New World of Blockchain

Wednesday, May 23: 3:00 p.m. - 3:45 p.m.

While some believe that blockchain is the most secure solution ever to come along, others think it is just a passing fad that puts essential data at risk. Get up-to-speed on the new world of blockchain.

Blockchain 101 With Oracle Cloud

3:00 p.m. - 3:45 p.m.

Blockchain is a distributed ledger technology that is most commonly known for enabling the cryptocurrencies and payment networks. Infosys and Oracle are leaders in enabling the infrastructure for blockchain-supported applications. Oracle Blockchain Cloud Service supports real-time secure transactions and regulatory compliance, reduces operations costs and fraud, and offers a pre-assembled platform.

Speakers:

Gurdeep Kalra, Principal Consultant, Infosys

Srikanth Challa, Senior Director, Blockchain, Infosys Ltd.

Wednesday, May 23

Track B: Digital Transformation

B201. Enabling the Real-Time Enterprise

Wednesday, May 23: 10:45 a.m. - 11:30 a.m.

The ability to glean real-time insights about customers, markets, and internal operations is critical today. Opportunities and risks abound and the ability to spot them faster can be the difference between success and failure.

Capital One Case Study: Building a Real-Time Fraud Engine

10:45 a.m. - 11:30 a.m.

Person-to-person payment (P2P) is a rapidly growing payment system within Capital One and all the other big banks in the U.S. Performing fraud analysis for each payment request is critical. This talk covers Capital One’s move from a micro-services-based fraud detection system to a new system that relies on stream processing (Apache Flink) and machine-learning to detect fraud.

Speaker:

Karun Komirishetty, Senior Manager, Software Engineering, Capital One

Enabling the Real-Time Enterprise - Data Lakes, Streaming and the Cloud

10:45 a.m. - 11:30 a.m.

To support a modern data architecture and approach to analytics, data integration strategies now support on-prem, cloud and hybrid deployments. Meanwhile, streaming architectures featuring change data capture (CDC) technology are rapidly being embraced to process data in motion. This session will discuss the new requirements and best practices to be successful in enabling a real-time enterprise, whether in a data lake, via streaming technology, or in the cloud.

Speaker:

Dan Potter, VP of Product Management & Marketing, Attunity

B202. The Trust Factor in the Analytics Era

Wednesday, May 23: 11:45 a.m. - 12:30 p.m.

Today, data is increasingly seen as the fuel of the business, rather than its byproduct. As a result, there is greater need to ensure data is of high quality.

Garbage in, Garbage Out: Why Data Quality Is the Lifeblood of Machine Learning

11:45 a.m. - 12:30 p.m.

The old adage “garbage in, garbage out” couldn’t ring truer when it comes to maximizing the value of machine learning in the enterprise. Machine learning is worthless if it’s fueled by bad data. This discussion helps attendees thread through the noise and understand exactly how to get the most out of machine learning by making their dirty data come clean. Learn more about the difference between machine learning, artificial intelligence, and deep learning; why collecting massive amounts of data simply isn’t enough to glean value from machine learning technology; what’s real and what’s hype when it comes to machine learning; and how to use machine learning to predict, identify patterns, and optimize processes.

Speaker:

Steve Zisk, Senior Product Marketing Manager, RedPoint Global

B203. Taking Advantage of Big Data Disruption

Wednesday, May 23: 2:00 p.m. - 2:45 p.m.

Big Data is challenging the status quo and spurring disruptive new technologies and services. Understanding the tools and technologies that are available, and the pros and cons of each is critical to making the right choices.

Capitalizing on Big Data Disruption: Hadoop, Machine Learning, IoT Analytics, & More

2:00 p.m. - 2:45 p.m.

As Big Data grows, there is the opportunity to explore and manage larger data volumes for business value. But with seemingly endless commercial, open source, and “as-a-service” offerings hitting the market each week, how do you choose the right mix of technologies and avoid creating an accidental architecture that will limit you from future innovation? How are organizations actually achieving true bottom-line benefits from their Big Data initiatives? This talk helps you understand how your organization can adopt an effective and agile approach to Big Data analytics while focusing on the analytical use cases that deliver a bottom-line and competitive impact.

Speaker:

Steve Sarsfield, Product Evangelist, Vertica Platform, Micro Focus

B204. Succeeding With Big Data in the Real World

Wednesday, May 23: 3:00 p.m. - 3:45 p.m.

Big Data has significant implications for industry and government. But it is not enough to collect large quantities of data and securely store it. Succeeding with Big Data in the real world requires planning and preparation.

The Role of Failure in Succeeding With Big Data

3:00 p.m. - 3:45 p.m.

In video games, players learn by failing—even if they have to “die” hundreds of times before learning how to succeed. By enabling us to simulate scenarios and predict outcomes, AI and Big Data have essentially made the world similar to a game that we can play with, yet we still expect immediate success. Is this realistic? In this presentation, technologist Weller explores the role of failure in machine learning using real-world examples.

Speaker:

Scott Weller, Co-Founder and CTO, SessionM

Wednesday, May 23

Track C: Cloud Day

Moderator:

Joe McKendrick, Principal Researcher, Unisphere Research

C201. The Journey to the Cloud

Wednesday, May 23: 10:45 a.m. - 11:30 a.m.

Many organizations have made the journey to the cloud or are about to embark on that path. How can you be sure the data is protected and stored with adherence to changing regulatory mandates? What do you move first? How do you avoid vendor lock-in?

Operationalizing Your Cloud Investments

10:45 a.m. - 11:30 a.m.

Today’s data-driven businesses are adopting Big Data to boost analytical capabilities and to minimize their dependency on obsolete legacy systems. While enterprises are stabilizing their on-premise Hadoop clusters, a majority of them are already targeting the next big thing: moving their storage and processing to cloud. This migration may sound attractive, but it comes with challenges. Of them, data security, regulatory, and compliance requirements are the most significant. Additional concerns include the uncertainty of costs, perceived loss of control, and vendor lock-in. Kamal covers the successful solution patterns used in overcoming the most prominent challenges.

Speakers:

Shahab Kamal, EVP, Customer Success & Solution Engineering, Bitwise

Mark Kubik, VP Data Analytics, Global Payments

Cloud Challenges

10:45 a.m. - 11:30 a.m.

The shift to Big Data platforms creates both opportunities and challenges. Storing three copies of your data doesn't protect against human error, corruption, or threats like Ransomware. Compliance with stringent new regulations presents additional challenges. Keys to success include accurate and rapid point-in-time recovery, efficient storage tiering, streamlined cloud migration, and rapid identification of external security threats.

Speaker:

George Folden, Director of System Engineering, ImanisData

C202. Cloud Success in the Real World

Wednesday, May 23: 11:45 a.m. - 12:30 p.m.

Cloud has become mainstream and today companies of all types are enjoying cloud success with solutions and services spanning the cloud stack.

Leveraging the Power of Cloud to Build & Operate Complex Data Platforms

11:45 a.m. - 12:30 p.m.

AEG Presents, the live-entertainment division of Los Angeles-based AEG and the largest producer of music festivals in North America, chose the Pythian Kick Analytics as a Service solution to power its sophisticated, data-driven marketing operations. By integrating data from multiple sources, including financial, sales and marketing, with Pythian’s help, AEG can now get a 360° view of its customer experience and be able to identify opportunities to promote events to clients based on past purchases in an intelligent and timely way. This presentation is for enterprise/data architects, data analysts, and digital marketing analysts who would like to learn how to leverage the power of cloud for building and operating complex data platforms.

Speaker:

Danil Zburivsky, Director of Engineering, Kick Analytics-as-a-Service, The Pythian Group

C203. Straight Talk on Cloud Licensing

Wednesday, May 23: 2:00 p.m. - 2:45 p.m.

Cloud makes many processes easier and others harder. Find out what you need to know about staying in compliance with software licensing in an increasingly hybrid world.

Straight Talk on the Software License Landscape in the Cloud

2:00 p.m. - 2:45 p.m.

As more and more organizations’ software infrastructure is being stretched between on-premise, public clouds, and hybrid clouds, maintaining compliance with software licensing is a more significant challenge than ever before. Sorting through all the FUD and getting straight answers from the vendors on the proper way to license software in this complicated world is not always easy. Some vendors have turned to software license audits as an easy way to generate additional revenue. This presentation discusses current software licensing trends in this cloud-fueled world, lessons learned, and steps every organization should take to stay in compliance.

Speakers:

Michael Corey, Co-Founder, LicenseFortress

Dean Bolton, Co-Founder, LicenseFortress

C204. Managing Big Data Analytics in the Cloud

Wednesday, May 23: 3:00 p.m. - 3:45 p.m.

Analytic processing––and the amount of data subject to analysis––is growing exponentially. Meanwhile, the amount of physical data center space isn’t keeping pace, and expansion of those environments is not becoming any more affordable. Find out where the cloud fits in.

Managing the Analytic Deluge in the Cloud

3:00 p.m. - 3:45 p.m.

The modern world of analytics abounds with cloud technologies that offer convenient, agile, and feature-rich approaches for running new analytic applications without requiring new investments in hardware, operating systems, or IT infrastructure. Jeschonek provides insights into how IT pros are leveraging the resources of the compute cloud to support assets running on premises without performance sacrifices or the need to rewrite existing file-based applications.

Speaker:

Scott Jeschonek, Principal Program Manager, Microsoft

Wednesday, May 23

Track CS: Cognitive Computing Summit

Moderator:

Christine (Chrissy) Geluk, Principal & Founder, Librarian At Your Service LLC

CS201. Ranking and Relevancy

Wednesday, May 23: 10:45 a.m. - 11:30 a.m.

Tuning search relevancy is a leading candidate to be replaced by machine learning.

Learning to Rank

10:45 a.m. - 11:30 a.m.

Learning to Rank (LTR) is now avialable for both Solr and Elasticsearch. Why is this such a hot topic? What does an organization need to leverage a Learning to Rank solution? Haubert explains the LTR pipeline in terms of what is available as an off-the-shelf solution and what isn't. She discusses the challenges faced when implementing LTR and some open research areas moving forward.

Speaker:

Elizabeth Haubert, Data Architect & Relevance Engineer, Open Source Connections LLC

CS202. Self-Learning Systems in Action

Wednesday, May 23: 11:45 a.m. - 12:30 p.m.

We still have lots to learn about the practical applications of machine learning, social media, and data science within our organizations.

Self-Learning Social Media Systems Through Machine Learning & Data Science

11:45 a.m. - 12:30 p.m.

Machine learning can be applied for sentiment analysis of unstructured data in the context of social media. For example, a large telecommunications organization leverages modern data science approaches to ease the daily business of communication experts and to give them features not available before. Learn how to use key capabilities of advanced text analytics such as language detection, genre identification, named entity extraction, key influencer or key opinion leader about a trend or brand lovers. Gain new ideas about predicting and determining potential social media crises before they happen.

Speakers:

Jana Mitkovska, Project Manager, Raytion

Christian Puzicha, Senior Solutions Architect, Raytion GmbH

CS203. Building a Cognitive Computing Platform

Wednesday, May 23: 2:00 p.m. - 2:45 p.m.

Cognitive computing platforms involve artificial intelligence, graph technology, and a host of other potential considerations.

Knowledge Graphs as a Pillar to AI

2:00 p.m. - 2:45 p.m.

Today’s knowledge workers want the findability and discoverability of Amazon’s recommendation engines and the intuitive usability of Google’s search rankings and knowledge graphs. These features have been organizational desires for years, but technology has finally caught up with the requirements backlog. Ivanov and Midkiff explain how ontologies serve as a framework for enterprise knowledge graphs, show how semantic layers complement traditional information models, and demonstrate how semantic knowledge models can be used as a basis for text mining. They use real-world production examples, including the Department of Veterans Affairs, the National Park Service, and the Harvard Business School.

Speakers:

Yanko Ivanov, Senior Knowledge Management Consultant, Enterprise Knowledge

James Midkiff, Semantic Web Technologist, Enterprise Knowledge

CS204. Getting Cognitive With Security

Wednesday, May 23: 3:00 p.m. - 3:45 p.m.

Security is a pervasive concern across a wide range of companies and industries. What do cognitive technologies bring to the table to ensure data is secure?

Cognitive Computing & the Future of Cybersecurity

3:00 p.m. - 3:45 p.m.

Businesses and data security leaders are constantly looking for ways to better anticipate and even predict threats before they happen. Two major challenges: Companies have a huge amount of data to process and very little time to do it. New forms of targeted attacks have evolved. These new threats require new thinking, and that’s where the latest cognitive capabilities can help.

Wednesday, May 23

Keynotes

Closing Keynote: What Does the Data Tell Us?

Wednesday, May 23: 4:00 p.m. - 5:00 p.m.

For the last 2 days, we've heard about exciting developments with technologies focused on the business uses for data, including machine learning, cloud computing, Hadoop, and, of course, cognitive computing. What we do with these technologies to advance business opportunities and avoid business risks is now up to each individual attendee. What are the takeaways that impact us?

Speaker:

Joe Caserta, Founding President, Caserta

Conference Program

Monday, May 21

Preconference Workshops - Morning

W1. Introduction to Hadoop

W2. Introduction to Data Science

Monday, May 21

Preconference Workshops - Afternoon

W3. Introduction to Cognitive Computing

W4. Introduction to Machine Learning

W5. Introduction to Blockchain

Tuesday, May 22

Keynotes

Welcome & Keynote - Once We Know Everything

Sponsored Keynote - Oracle

Tuesday, May 22

Track A: Moving to a Modern Data Architecture

A101. Modern Data Architecture Design

A102. Making the Shift From Relational to NoSQL

A103. IoT Data Strategies

A104. Overcoming Big Data Integration Challenges

A105. The Rise of Graphs & Enterprise Data Fabric

Tuesday, May 22

Track B: Competing on Analytics

B101. Becoming a Data-Driven Enterprise

B102. A 360° Customer View

B103. Analytics in Action

B104. Beating the Barriers to Better Analytics

B105. The Future of Analytics

Tuesday, May 22

Track H: Hadoop Day

H101. The Big Data Ecosystem Today

H102. Data Lake Best Practices

H103. The Hybrid Future of Big Data

H104. Unleashing Your Big Data With Spark

H105. The Challenges & Tactics for Improving Analytic Performance

Tuesday, May 22

Track CS: Cognitive Computing Summit

CS101. The Rise of AI

CS102. The Path to Cognitive Search

CS103. Machine Learning in the Real World

CS104. From Zero To Hero With TensorFlow

CS105. PANEL: Beyond “Do No Harm”— The Complex Considerations of Ethical AI

Wednesday, May 23

Keynotes

Keynote - Capturing Value With Cognitive Computing: Executing on Data Challenges

Sponsored Keynote - IBM

Sponsored Keynote - Bedrock Data

Wednesday, May 23

Track A: Building the Data-Driven Future

A201. The Rise of DataOps

A202. Modern Analytical Techniques

A203. Preparing for GDPR

A204. The New World of Blockchain

Wednesday, May 23

Track B: Digital Transformation

B201. Enabling the Real-Time Enterprise

B202. The Trust Factor in the Analytics Era

B203. Taking Advantage of Big Data Disruption

B204. Succeeding With Big Data in the Real World

Wednesday, May 23

Track C: Cloud Day

C201. The Journey to the Cloud

C202. Cloud Success in the Real World

C203. Straight Talk on Cloud Licensing

C204. Managing Big Data Analytics in the Cloud

Wednesday, May 23

Track CS: Cognitive Computing Summit

CS201. Ranking and Relevancy

CS202. Self-Learning Systems in Action

CS203. Building a Cognitive Computing Platform

CS204. Getting Cognitive With Security

Wednesday, May 23

Keynotes

Closing Keynote: What Does the Data Tell Us?

Don’t Miss These Special Events

BROUGHT TO YOU BY

Diamond Sponsors

Platinum Sponsors

Gold Sponsors

Association Partners