Imply Continues to Alleviate Analytics Application Pain Points with Project Shapeshift’s Second Milestone

Sep 20, 2022

By Sydney Blanchard

Imply, founded by the original creators of Apache Druid, has announced the second milestone of Project Shapeshift, its initiative to identify and resolve the problems developers face when building analytics applications. Imply aims to be the database of choice for any developer building applications that support digital businesses, a goal that sits at the intersection of application functionality and big data analytics.

“Developers, because of their own digital initiatives—driven by the advent of streaming data, the use case of Cloud services applications, IoT—reveal these macro-level trends that are occurring, where every company is trying to become digital, or those that are already digital, are trying to become more competitive; they need to build new analytics use cases,” said David Wang, VP of product marketing at Imply. “That new analytics use case is called analytics application, and effectively, it’s marrying common traits brought by both the analytics world and the app world.”

Project Shapeshift was unveiled at Druid Summit 2021, marking Imply’s commitment to an analytics-meets-application database. The newly announced milestone consists of enhancements to Druid’s capabilities, the Total Value Guarantee, and updates to the Imply Polaris cloud service.

Building on the multi-stage query engine that debuted privately in March, Apache Druid 24.0 improves ingestion speed, data quality, and integration features. Batch ingestion is now up to 65% faster in Druid and uses common SQL queries that will continue to support new extensions to the database, offering an opportunity to save time and financial resources. With SQL-based in-database transformation, Druid improves data quality by enabling data enhancement and enrichment, as well as easy experimentation with aggregates and approximations, such as HyperLogLog and theta sketches. Further, Druid can now integrate with a range of open-source and commercial data tools, with Informatica, Fivetran, Monte Carlo, Bigeye, and dbt cited as a few examples.

Apache Druid, first introduced in 2012, was built to fill a gap in the database types available at the time. After its release as an open-source solution, its creators noticed a trend: a large audience of organizations whose developers were finding significant value in Apache Druid. Companies such as Netflix, Salesforce, and Target saw the value in its unique approach, which rivaled other analytics databases by combining application capabilities with analytics.

“We always thought of Druid as a shapeshifter when we originally built it to support analytics apps of any scale,” said Gian Merlino, CTO and co-founder of Imply and PMC chair for Apache Druid. “Now we’re excited to show the world just how nimble it can be with the addition of multi-stage queries and SQL-based ingestion.”

Imply highlights its resource savings, stating that developers can spend less time managing the database and less money on infrastructure, ultimately reducing the total cost of ownership (TCO). The new Total Value Guarantee builds on that claim: it is a global guarantee for qualified Apache Druid users that the TCO with Imply will be lower than the TCO of running Apache Druid independently. Essentially, Imply is offering Apache Druid users a way to gain the benefits of a partnership effectively for free, according to the vendor.

This milestone also introduces improvements to Imply Polaris, a database-as-a-service built on Apache Druid. Cloud integration with Druid, according to the vendor, optimizes data operations and delivers an end-to-end service from stream ingestion to data visualization. The latest updates feature schemaless ingestion to support nested columns, as well as faster, subsecond approximate queries using DataSketches. Data visualization within Polaris is now easier to use, and the service adds performance and monitoring alerts to identify low-latency queries, along with increased security via resource-based access control (RBAC) and row-level security. Finally, new node types for Polaris accommodate different price and performance requirements, and comprehensive consumption and billing metrics provide usage insights.

“Companies built in open source have that challenge of having to think, ‘do we innovate, or do we prioritize proprietary code at the underlying technology?’,” said Wang. “For us, we’re committed to open source, but we can also build a thriving business at the same time. We contribute code and allow the Apache Druid project and technology to flourish and become more capable and easier to use, but at the same time, we’ve built a business that is really rooted in the Imply developer experience.”

Imply plans to continue innovating on the core technology of Apache Druid while simultaneously improving the developer experience. User experience is at the forefront for Imply, and future iterations of Apache Druid will continue to adhere to these values, according to the company.

Apache Druid 24.0 will be publicly available for download on druid.apache.org. Imply Polaris features are readily available.

To learn more about this milestone, please visit https://imply.io/.