MapR Technologies, which provides a data platform for AI and analytics, is introducing new capabilities aimed at speeding the operational impact of automated analytics, improving the productivity of developers and data scientists, lowering TCO, and streamlining security and storage across data center, cloud, and edge deployments.
The newest release of the MapR Data Platform is entering customer beta test and will be generally available in Q3 2018.
MapR’s newest innovations are focused on enabling data scientists and developers to leverage all data, said Anoop Dawar, senior vice president of product management and marketing at MapR. Built in close collaboration with leading customers, the new release supports and enables distributed AI and analytics spanning "multi-temperature," multi-protocol, on-premises, edge, and cloud deployments, he noted.
Additions to the platform include enhancement of the data fabric to cloud storage through object tiering; fast ingest erasure coding for more cost-effective, long-term data retention; security innovations to automatically enable security across the environment; and a new S3 API supporting next-gen applications and increasing application portability.
The way big data is collected, stored, processed, and analyzed has changed, said Dawar, spanning Hadoop MapReduce, cloud and increasingly multi-cloud, Spark, streaming analytics, the use of AI, the growing importance of edge, and the expanding use of containerization.
Against this rapidly changing landscape of data solutions, it is clear that organizations will need to use the data of yesterday, today, and tomorrow with the tools of yesterday, today, and tomorrow, said Dawar. MapR’s goal, he said, is to provide a platform that survives and thrives amidst the transitions in technology and deployment going on now and in the future. That is the crux of the approach that forms the foundation of a modern data platform for AI, analytics, IoT, and edge, he added.
Core data services innovations to speed AI and analytics and lower TCO include end-to-end, policy-driven automatic data placement across performance-optimized, capacity-optimized and cost-optimized tiers, on-prem or in cloud, with object tiering for effectively storing data that may be frequently, infrequently, or rarely accessed; fast ingest erasure coding that can be used for capacity-optimized tiers or with high speed SSDs for an optimized analytics tier; and secure file-based services to ensure corporate security compliance with NFSv4.
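To illustrate why erasure coding is generally described as more cost-effective than full replication for long-term retention, the arithmetic can be sketched as follows. The 3-copy replication factor and the 4+2 fragment scheme here are illustrative assumptions only; the announcement does not state which schemes the MapR release uses, and the function names are hypothetical.

```python
def raw_capacity_replicated(logical_tb, copies=3):
    """Raw storage needed when every block is stored `copies` times.

    The 3-copy default is an assumption for illustration.
    """
    return logical_tb * copies


def raw_capacity_erasure_coded(logical_tb, data_fragments=4, parity_fragments=2):
    """Raw storage needed when data is split into data fragments plus parity.

    The 4+2 default is an assumption, not a documented MapR setting.
    """
    total_fragments = data_fragments + parity_fragments
    return logical_tb * total_fragments / data_fragments


if __name__ == "__main__":
    logical_tb = 100
    print("replication:", raw_capacity_replicated(logical_tb), "TB raw")
    print("erasure coding:", raw_capacity_erasure_coded(logical_tb), "TB raw")
```

Under these assumed parameters, 100 TB of logical data would occupy 300 TB raw with replication and 150 TB raw with erasure coding, which is the kind of saving that makes a capacity-optimized tier attractive.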
The native S3 interface for next-generation applications, which enables direct analytics on operational data and transparent application portability across on-premises and multi-cloud environments, is notable, said Dawar, because S3 has become the de facto standard for applications and for analyzing data in place. In addition, S3 is gaining relevance as the interface for analytical processing along with HDFS.
The new release also provides simplified development and deployment of AI and analytic applications, including high performance, continuous processing with Spark 2.3 for structured streaming and machine learning; analytics toolkit support with Hive 2.3 that has over 800 JIRAs resolved; non-programmer enablement to create streaming applications with KSQL; and simplified streaming analytics application development with KStreams.
If AI is "the new electricity," then the data fabric that spans edge to cloud to the data center, and gives organizations the mechanism to deal with all of their data, is "the new grid," said Dawar.
Providing streamlined security and critical data asset protection, the updated MapR platform offers volume-based data encryption at rest as an additional means to prevent unauthorized access to sensitive data. Encryption is also used to avoid exposure to breaches such as packet sniffing and theft of storage devices. MapR’s philosophy of “Secure by Default” enables data platform security out-of-the-box including core and ecosystem services for new installations with a single click. All data can be stored encrypted and all network connections are encrypted with authentication enabled.
With these enhancements, security analysts can audit events and usage in the same way streaming analysis allows for real-time BI, said Dawar. Security analysts can look at who touched customer records outside of business hours, what actions a user took before leaving the company, what operations were performed without change control, which users are accessing sensitive files from protected secured source IPs, or why some reports look different despite being sourced from the same underlying data.
And, this information—such as who accesses specific files, and how often files are touched—can also be used to inform policies for efficient storage tiering, Dawar said.
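The kinds of audit questions Dawar describes, and the idea of feeding access frequency back into tiering policy, might look something like the sketch below. Everything in it is hypothetical: the event layout, the sample records, the business-hours window, and the access threshold are assumptions for illustration and do not reflect MapR's actual audit log format or policy engine.

```python
from collections import Counter
from datetime import datetime

# Hypothetical audit events: (ISO timestamp, user, file path).
EVENTS = [
    ("2018-06-04T10:15:00", "alice", "/data/customers.csv"),
    ("2018-06-04T23:40:00", "bob", "/data/customers.csv"),
    ("2018-06-05T09:05:00", "alice", "/data/customers.csv"),
    ("2018-06-05T14:30:00", "carol", "/archive/2015.log"),
]


def outside_business_hours(events, start_hour=9, end_hour=17):
    """Return events whose hour falls outside [start_hour, end_hour)."""
    flagged = []
    for timestamp, user, path in events:
        hour = datetime.fromisoformat(timestamp).hour
        if not (start_hour <= hour < end_hour):
            flagged.append((timestamp, user, path))
    return flagged


def suggest_tiers(events, hot_threshold=3):
    """Suggest a tier per file from how often it appears in the audit events.

    The tier names and the threshold are assumed values for this sketch.
    """
    access_counts = Counter(path for _, _, path in events)
    return {
        path: "performance" if count >= hot_threshold else "capacity"
        for path, count in access_counts.items()
    }


if __name__ == "__main__":
    print(outside_business_hours(EVENTS))
    print(suggest_tiers(EVENTS))
```

With the sample records above, only the late-evening access by the hypothetical user "bob" is flagged, and the frequently touched customer file is suggested for the performance tier while the rarely touched archive file is suggested for the capacity tier.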
For more information, visit mapr.com.