Artificial Intelligence Has a 1% Problem

Aug 14, 2017

By Joyce Wells

There are plenty of pronouncements about artificial intelligence—both in terms of the miracles it can produce and the threat it poses to humanity. But Ali Ghodsi, co-founder and CEO of Databricks, a company that emerged from the UC Berkeley AMPLab with a focus on Apache Spark, is asking everyone to exercise a little restraint.

According to Ghodsi, 73% of the use cases that Databricks supports for its 500-plus customers are AI-based, and many of these organizations are working hard to put AI to use. At the same time, hyperbolic stories abound about how AI is taking over and how successful, and possibly dangerous, it is.

“It seems like there is such an extreme contrast between the reality we are seeing when we are talking to Fortune 2000 companies that are trying to get their predictive problems solved and what the press is talking about,” said Ghodsi.

AI’s 1% Problem

Everyone wants to be involved in AI and recognizes its great potential, and if you ask CIOs and C-level executives how they intend to stay competitive, chances are that the use of artificial intelligence is among their top five issues, he noted. The myth is that AI is everywhere, but the reality is very different, said Ghodsi.

Instead, there is a “1% problem”: only a handful of companies, such as Google, Amazon, and a few others, are actually accomplishing their goals with AI. These tech leaders have tens of thousands of Silicon Valley engineers, many of whom hold PhDs or are top professors who have been recruited from leading universities such as MIT, Stanford, and UC Berkeley, and they are focused on a few narrow problems like self-driving cars or getting people to click on more ads. They have been fairly successful in the limited scope of what they are trying to achieve, but the rest of the companies, “the 99%,” don’t have these resources and are finding that there is great complexity surrounding the problems they are tackling, Ghodsi explained.

Early Applications

Two leading examples of sectors where Databricks is seeing early use of AI are industrial IT and healthcare. Companies are collecting data from sensors and feeding it into Databricks to predict the likelihood of finding oil in one location versus another, so they can be more efficient and reduce their environmental impact. And in healthcare, organizations are trying to use AI to help identify cancerous tumors in imaging. But each of those areas requires not only data scientists and data engineers but also subject matter experts, who usually are not familiar with artificial intelligence, database systems, or data warehouses, noted Ghodsi.

In the case of using AI to identify tumors, the application is not even close to being fully automated. Google with its legions of PhDs can develop technology to tell cats from dogs, “and it is almost funny if they get it wrong—it’s a cat but looks like a dog,” said Ghodsi. But in the healthcare field, if someone says it is a cancer tumor but they are wrong, it can have life-altering results.

What Needs to Happen for Greater Use of AI

According to Ghodsi, three things need to be addressed to enable more widespread use of AI among more companies.

Skills gap – First, domain experts are needed. For example, Ghodsi said, to identify cancers you need doctors, data scientists who understand the machine learning software used to build predictive models, and data engineers who understand databases and data warehouses, know where to store data, and can deal with the massive variety, velocity, and other pieces of the big data problem. It is challenging to get all those different personas to work together in what can be a politically charged atmosphere with concerns about control and authority.
Disparate tools – Right now there are too many tools that need to be stitched together, and many of them are open source, said Ghodsi. The tools span data cleaning, ingestion, security, predictions, and monitoring. To enable all these open source tools to work together today, companies have to hire developers to make the software interoperate successfully, and that is just to get the software to work together, said Ghodsi.
Infrastructure – The third challenge that Ghodsi sees is just running the infrastructure involving machines and clusters and making sure it is all secure and that the data is flowing in a governed manner, since a breach or leak can have dire consequences for organizations in heavily regulated industries.

Cloud Is Critical

According to Ghodsi, those are the three top issues that need to be dealt with early on to embark on an AI journey. To address them, what’s needed are tools that enable collaboration, integrated systems, and greater consistency and availability of infrastructure enabled by the cloud.

Collaborative tools – The best way to address the skills gap is to have tools that enable collaboration across multiple personas. That will be key, said Ghodsi.
Integrated technology – Technology needs to be more integrated and work together better, so that cleaning data, turning it into a mathematical format that can be used for predictions, adding in more data, building predictive models, and putting the software into production can all be done in an automated fashion. The ability to add more data to the original datasets to enrich them can be the difference between bad predictions and good predictions, said Ghodsi.
Cloud – To make sure that the infrastructure is always up and running, cloud is the way to go, Ghodsi noted. Reliance on cloud enables automation and outsourcing of infrastructure operations. It just so happens that the 1% of companies that have been successful with AI are all cloud companies. That may be coincidence or causation, but Google, Facebook, AWS, and other companies that are truly successful with AI are cloud companies, he said. “We think that is a big differentiator.”

What AI Is and What It Is Not

There is still a long road ahead before AI becomes widely used, and the most advanced work being done with AI is not trying to replace the human brain, as is often feared; it is really augmenting it and helping humans to do challenging tasks better, contends Ghodsi.

Google has an immense memory of all the websites around the world, and whether that is AI or simply a database with a lot of entries can be considered a philosophical question, he said. However, there can be no doubt that it has enabled humans to find information very efficiently, since no one could ever store all that data on their own and then identify the source of information that is needed at the time. And, if you have a map and need to find the fastest route between two points, software can do that well. However, AI is making little progress in anything that requires creativity and is not super-structured, according to Ghodsi.

Ghodsi pointed to the recent competition in which Google’s AlphaGo was able to defeat human Go champions because it can do simulations faster than the human brain. But, he said, if you asked AlphaGo for reflections on its victory, and asked it to identify the turning point in the game where it believed it had clinched the victory, it would not be able to do so. On the other hand, a human could go on for hours talking about the twists and turns of their decision making. AlphaGo had simply been running through a series of programmed algorithms that simulate scenarios and was picking the best one. Is it really replacing a human being? No, Ghodsi said.

The Bottom Line

“I don’t think AI at its core is bad for humanity,” said Ghodsi, pointing out that it is not decreasing the amount of resources on the planet, or the food, education, and healthcare available to people.

But before any of the sought-after results can be achieved on a widespread level, he believes, the core three problems of the skills gap, disparate tools, and infrastructure will need to be addressed.

The truth is that answering questions such as which genes are responsible for certain diseases is difficult, but everyone wants to say they are having “fabulous results,” said Ghodsi. “Nobody wants to say ‘this is actually pretty hard and challenging.’”

The other issue, he noted, is that all the hype about AI is in danger of overshadowing its huge potential by giving the false impression that AI poses an imminent danger.

Databricks’ goal is to simplify as much as possible and to make technology available for the 99% by democratizing the process and helping companies that are not even close to being as far along as the “Googles” of the world to take advantage of AI, said Ghodsi.