Platform Computing, a provider of cluster, grid and cloud management software, has announced support for the Apache Hadoop MapReduce programming model to bring enterprise-class distributed computing to business analytics applications that process big data.
As data stores have expanded, so has the need for platforms that can perform business analytics in a timely manner across distributed environments while delivering high reliability, availability, scale and manageability. Platform's core distributed workload engines, found in Platform LSF and Platform Symphony, lend themselves to handling "big" data because they provide the support necessary to access, process and analyze multiple data types efficiently and quickly, at large volume and to enterprise-class standards, the company says.
Platform Computing supports a wide range of customers, many of which are embarking on MapReduce initiatives as a way of managing the extreme growth in unstructured data, Scott Campbell, director, product management, Enterprise Analytics at Platform Computing, tells 5 Minute Briefing. "Of these, most have started testing with the Hadoop MapReduce architecture as a way to understand the technology and its capabilities. These are the same customers who are highly interested in Platform Computing extending our expertise in high performance computing and sophisticated workload management into the enterprise. It is a natural progression of the Platform Computing technology for existing customers."
The typical environments Platform Computing currently supports are more application-centric than data-centric, Campbell explains. "However, with that said, there are many customers that are using Platform Computing software to manage tens to hundreds of terabytes of data today. With MapReduce, the workload is much more data-centric and the data sets tend to be composed mostly of unstructured data. In this environment, we are seeing significant increases in the data set sizes as unstructured data continues to grow very rapidly. We are seeing analysis against hundreds of terabytes and even petabytes of data."
By extending enterprise-class capabilities to MapReduce distributed workloads, Platform enables customers to scale shared applications to thousands of commodity server cores and run them at very high execution rates. The platform also gives IT manageability and monitoring, with control over workload policies for multiple lines of business users and applications, and provides built-in high availability services that ensure quality of service.
Platform Computing offers a distributed analytics platform that is fully compatible with the Apache Hadoop MapReduce programming model, allowing current MapReduce applications to move easily to Platform's distributed computing workload platform while also supporting multiple distributed file systems. Platform Computing's solution also provides enterprise-class capabilities to deliver scaled-out MapReduce workload distribution. Designed to support more than 1,000 simultaneous applications, it allows organizations to increase server utilization across up to 40,000 cores, resulting in a high return on investment, the company says.
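For readers unfamiliar with the MapReduce programming model the article refers to, the idea can be sketched in a few lines of Python. This is a toy, single-process illustration of the map, shuffle and reduce steps, not Platform's or Hadoop's actual API (real Hadoop jobs typically implement Mapper and Reducer classes in Java and run across a cluster):

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the user's map function to every input record,
    collecting the (key, value) pairs it emits."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle(pairs):
    """Group all emitted values by key -- the framework's
    shuffle/sort step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped, reducer):
    """Apply the user's reduce function to each key's value list."""
    return {key: reducer(key, values) for key, values in grouped.items()}

# Classic word-count example: the mapper emits (word, 1) for each
# word in a line; the reducer sums the counts per word.
def mapper(line):
    return [(word, 1) for word in line.split()]

def reducer(word, counts):
    return sum(counts)

lines = ["big data big analytics", "big data"]
result = reduce_phase(shuffle(map_phase(lines, mapper)), reducer)
# result == {"big": 3, "data": 2, "analytics": 1}
```

The appeal of the model, and the reason a workload engine like Platform's can host it, is that the user supplies only the map and reduce functions while the framework handles distribution, grouping and fault tolerance.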
"MapReduce has grown up as a way to effectively manage unstructured data within scale-out, distributed file system architectures. This is where the majority of the efforts are targeted by our customers today," Campbell notes. "However, the logic is not limited to unstructured data and Platform Computing also has the ability to extend the reach into data warehouse technologies. One of the key differentiators to the Platform Computing technology for MapReduce is the ability to manage different data sources and output targets. This not only means support for multiple distributed file system architectures, but also for MPP type of database architectures."
For complete details about Platform Computing's MapReduce solutions, visit www.platform.com/mapreduce.