In a keynote at Data Summit 2019 titled “Digital Transformation Is Business Transformation: How to Incorporate AI Technology Into a 130-Year-Old Company,” Helena Deus, technology research director, Elsevier, showcased how the company combines content and data with analytics and technology to help researchers make new discoveries and have more impact on society, and to help clinicians treat patients better and save more lives.
Deus was presenting the work by Dr. Michelle Gregory, SVP Data Science, Elsevier.
Elsevier is part of a larger group called RELX Group, and has three sister companies: Reed Exhibitions (Comicon); LexisNexis (legal data analysis); and RBI. As a whole, the group's revenue was $7.9 billion in 2018.
Elsevier, a 130-year-old global company, has published mainly books and journals for many years, but when the world became digital it recognized that it had to make that transition as well. The move to ebooks and digital papers created the need for search platforms to help people discover content, said Deus. There are more papers than anyone can read, and what customers need are decision-making tools that understand their needs and questions and give them the right answers: answers they can trust, backed by evidence.
As it moved into “doing,” answering customers’ questions instead of making them read the papers, the company found that machine learning surpassed any other approach it had used to solve the “do” problem, Deus emphasized.
Use Case for AI for Scalability and Cost Savings
For a traditional B2B company, the transition to data and analytics as a service is not always straightforward, said Deus, who showcased two use cases at Elsevier.
Optimizing for discovery ensures user engagement and drives business value: Reaxys, a product for chemists, helps them find patented or published molecules and the reactions to build them. Elsevier is using machine learning to identify relevant documents and determine whether they are useful.
It is very important for funders to know the outcomes of the money they provide and for researchers themselves to know who can fund their research. Most scientific articles, but not all, mention the funding information in the acknowledgements section, but there is no single way to acknowledge a funder. The task was to extract the funding body, the grant ID if any, and the relation between them. Elsevier used a taxonomy, FundRef, to look for funder names. The process included indexing articles with Scopus; locating funding snippets; extracting funding bodies, grant IDs, and the relations between them; and linking funding bodies to that standardized taxonomy of funders. Using machine learning to address this task, Elsevier could look anywhere in the text, not just the acknowledgements. A simple support vector machine classified snippets as useful/not useful. The company then used an ensemble, or committee, method to let multiple annotators vote on the funder each of them extracted. With this technique, recall went up substantially while precision stayed the same, so in the end the F1 score was 83%.
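The committee vote and the F1 score mentioned in this use case can be illustrated with a minimal sketch. It is not Elsevier's implementation: the function names, the `min_votes` threshold, the lowercase normalization, and the sample values are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def committee_vote(candidates: Sequence[Optional[str]],
                   min_votes: int = 2) -> Optional[str]:
    """Return the funder name proposed by the most annotators.

    Each entry in `candidates` is one annotator's extracted funder
    (None or an empty string if that annotator found nothing).
    Names are lowercased and stripped so trivial variants count as one vote.
    Returns None when no name reaches `min_votes` (a hypothetical threshold).
    Ties go to the name that was seen first.
    """
    votes = Counter(c.strip().lower() for c in candidates if c and c.strip())
    if not votes:
        return None
    funder, count = votes.most_common(1)[0]
    return funder if count >= min_votes else None


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative annotator outputs for one funding snippet (made-up data).
    extracted = ["National Science Foundation",
                 "national science foundation ",
                 None,
                 "NIH"]
    print(committee_vote(extracted))
    # Illustrative precision/recall values, not figures from the talk.
    print(round(f1_score(0.8, 0.9), 3))
```

Because F1 is a harmonic mean, raising recall while holding precision constant raises the score, which matches the effect described for the committee approach.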
Guiding principles and lessons learned, according to Deus:
You need close collaboration with product teams
- Product KPIs are based on OKRs
- Work in the squad model
- No F-score matters beyond the customer perspective
You need close collaboration with technology teams
- Work hand-in-hand on algorithm development and the production environment (compute requirements, etc.)
- Production environment and development tools constrain data science tools
- All data science services should be able to run on any platform defined by the architecture review board (flexible, retrainable, etc.)
Stay close to your production systems
- Use these techniques to improve customer outcomes, and let the customer needs drive what you need to do
- Stay close to production data when operationalizing—training data is a subset of actual data
Engage with the AI community
- Organize horizontal groups around capabilities, and let them publish and attend conferences to attract talent and learn best practices
- Use a scrum of scrums for shared learning; put junior and senior people in the same team for knowledge transfer
- Write small APIs and reuse them across teams and feature requirements
Conclusion: 5 principles of success for integrating AI into business practices
Support from the top: You need support from the top; without it, you are not going to go far.
- Shared responsibilities (OKRs)
- Tough decisions
- Resources
Change management
- Challenge mindsets
- Retrain staff
- Shift skill sets
Organized for success: Organizations can be decentralized or centralized, but make sure the right people are talking
- Different models depending on size, maturity (not one-size-fits-all)
- Centralized
- Decentralized
Expertise: Hire if you can, but if not, train people
- Hiring/retaining right skills
- Collaboration/teamwork
- Passion / focus
Demonstrable impact: Pick the low-hanging fruit and build from there. Identify what you can solve today or this week. Start small and build your way from there.
- Pick low-hanging fruit that impacts customers
- Pick projects that impact efficiency
- Demonstrate effectiveness before suggesting new enterprise systems
Many presenters, including Deus and Gregory, have made their slide decks available on the Data Summit 2019 website at www.dbta.com/DataSummit/2019/Presentations.aspx.