Seeking to help standardize and simplify Hadoop adoption, Pivotal made three key announcements. First, the company is open sourcing components of its Big Data Suite, including Pivotal HD, HAWQ, Greenplum Database, and GemFire. By sharing the capabilities of Pivotal HD, an enterprise Hadoop distribution, HAWQ, a parallel SQL query engine, the Greenplum analytics database, and GemFire, a distributed in-memory platform, with the open source community, Pivotal says it is providing the necessary components to build solutions that work for all data needs.
Second, answering a need for greater unification and predictability in the big data space, Pivotal and other big data industry leaders have announced plans for the creation of the Open Data Platform (ODP). The new industry association is being founded with Platinum members GE, Hortonworks, IBM, Infosys, Pivotal, and SAS, and Gold members EMC, Verizon Enterprise Solutions, and VMware. The companies say they will work together to drive collaboration, innovation, and standardization across big data technologies by optimizing testing among and across the ecosystem’s vendors and accelerating the ability of enterprises to build or implement data-driven applications.
And third, Pivotal and Platinum ODP member Hortonworks also announced a joint effort to simplify adoption of Hadoop through a strategic and commercial partnership that aligns both companies around a consistent core of Apache Hadoop-based capabilities, including product integrations, joint engineering, and production support.
Pivotal Big Data Suite: Single SKU and Easy To Consume
Pivotal Big Data Suite will be a suite of big data products that are completely open-sourced. The goal is to help remove barriers to big data deployments through a single bundle that gives both application developers and data practitioners access to flexible deployment of a data lake, tools for advanced analytics and data science, and a portfolio of building blocks for supporting custom data-centric scale-out applications.
In announcing the new open source initiative, Michael Cucchi, senior director, Data Product Group, Pivotal, observed that Hadoop is maturing and having a significant impact, in essence, “becoming the cloud database” for the future.
Concurrently, there is also a strong and growing acceptance of the open source approach, which is becoming an enterprise standard. As large service providers, telcos, and infrastructure providers look to re-platform using next-gen analytics and big data solutions, they are insisting on open source components to avoid the lock-in and inflexibility that they had in the past with proprietary solutions. “We are seeing open source come up as a requirement on RFPs,” Cucchi noted.
But along with the rise of Hadoop and open source overall, he added, “There is a skills gap between enterprise software and enterprise stacks and where the pure-play Hadoop market is today.” SQL will remain the most valuable workload on Hadoop and the most valuable interface into data, he said, because there are probably only about 2,000 people in the world who are MapReduce experts, and yet there are millions of SQL experts.
Pivotal had $100 million in big data software licensing in 2014 alone and more than $40 million in the first year for the Big Data Suite subscription offering, which was launched in April 2014. By offering Big Data Suite by subscription, the company simplified the portfolio with flex licensing that provided access to all components. “Customers saw the value of connecting our data products and analytics products with our platform as a service and they saw the value of using the data and the analytics insights in applications.” Open sourcing the Big Data Suite products will build on that success and help expand the ecosystem.
Delivered On an Open Cloud Platform
According to Pivotal, the focus among customers has shifted from storing data to actually leveraging big data’s potential, and this evolution demands a more agile data management approach. Pivotal Big Data Suite offers customers an agile data solution based on open source software that can be flexibly deployed using cloud technologies. Pivotal Big Data Suite will support bare metal commodity hardware, appliance-based delivery, virtualized instances, and now public, private, and hybrid clouds. In addition, Pivotal Cloud Foundry will be included, giving customers the ability to leverage Big Data Suite capabilities in Pivotal Cloud Foundry applications.
“We have a really strong heritage with CloudFoundry and all of our Spring technologies. We are open source through and through and with this move we complete it and take all of our data products to the open source community,” said Cucchi. When CloudFoundry was brought to the open source community right around Pivotal’s inception, the community was bifurcated and complicated, and it was hard for an enterprise to make a platform as a service decision, said Cucchi. “We built the CloudFoundry Foundation starting with seven major industry players and enterprise customers, and today it has grown to 45 sponsors strong, and they are all the stewards of an open source standard that is enterprise class platform as a service. We are going to do the same thing now, and we are excited to join up with the Open Data Platform, which is effectively an initiative looking to do the exact same thing - but for Hadoop.”
As an initiative, the Open Data Platform will “ride above Apache” and be “totally complementary to Apache,” emphasized Cucchi. He recalled the emergence of the Linux kernel, which allowed the Linux community to standardize on a set of core packages that were predictable and standard and allowed companies to build distributions around them. “We are not calling it a kernel; we are calling it a Hadoop core, but that is our goal - we are trying to define and standardize on a set of Apache packages that we can confidently and stably build a really strong, thriving ecosystem on top of.”