Along with lighter, faster-to-deploy databases, a key data environment making its way into the mainstream for processing unstructured data workloads is the Hadoop framework. Unisphere Research’s “Big Data, Big Challenges” survey confirms that while many enterprises are in the early stages of adoption, Hadoop will increasingly make inroads into the enterprise, with usage of Hadoop/MapReduce more than doubling in the next year from 7% to 16%.
Hadoop provides enterprises with a way to integrate a variety of data types, at any scale, into a single environment from which the data can be consumed. This is key for developing single views of information already being generated by various systems across the enterprise, from financial systems to supply chains to human resource systems, along with new data sources such as machine-generated and user-generated data. Hadoop makes it possible to ingest both unstructured and structured data and make it available as information for the enterprise. Tools such as Hadoop and new types of databases enable enterprises to obtain a finer-grained view of any and all data coming through their systems. Hadoop is being applied to a wide range of data processing and analytics tasks, including mobile data support, IT security, online advertising, image processing, and online travel booking.
For more articles on this topic, access the DBTA Best Practices section on Enterprise-Ready NoSQL, NewSQL and Hadoop - What's Ahead for Big Data.
Hadoop can process a wide variety of data types and then make the data available to any and all enterprise applications. Its ability to incorporate big data and make it available to existing environments, such as relational databases dependent on extract, transform, and load (ETL) processes, makes it an easy fit into the enterprise.
At the root of these initiatives around nonrelational databases and platforms is the quest to compete on analytics. Sixty-two percent of the respondents to the “Big Data Opportunities” survey, in fact, cite predictive analytics as the leading use case for big data. Effectively deploying data analytics requires the ability to tap into, integrate, and explore the vast array of unstructured data coming into enterprises.
Much of the data now coming in, due to cost or structure, may be better suited for storage within nonrelational environments as the first option. The bottom line is that most enterprises are accumulating vast stores of both structured and unstructured data, which ultimately need to be integrated. The key challenge for many data managers is to move away from point-to-point integration, which is often welded to specific applications and is not sustainable within big data environments, and toward a well-architected data environment that can readily ingest and analyze massive and varied data sets.
This also requires investment in new skill sets beyond the relational model, as well as leadership from data managers and professionals to help their organizations understand which types of technologies and platforms merit investment, and which parts of the enterprise need to be modernized rather than replaced. In the “Big Data is Real and It Is Here” survey, lack of available skills to manage NoSQL or nonrelational databases led all other challenges, cited by 33% of respondents. A significant skills gap is emerging as organizations seek managers and professionals who understand and can leverage various unstructured data environments.
Ultimately, enterprises will develop strategies and architectures that accommodate existing relational database approaches along with the latest platforms and technologies. The key is to foster an environment of coexistence between relational database systems and unstructured data environments.