"New platforms, such as Hadoop, are rapidly introducing new models of data-parallel computation, such as MapReduce, to environments traditionally focused on client-based query of SQL stores," says ScaleOut's Bain. This enables massive datasets to be analyzed without creating bottlenecks in disk and network I/O while transparently harnessing the power of large computational clusters. Warehousing environments which can efficiently re-stage data in NoSQL stores to support the new, data-parallel, analytics models will benefit the most from the power of big data.
Ultimately, the key is to share, share, share, Probstein continues. “It's critical to get your whole company on the same infrastructure. You can't maximize your ROI if one set of analytics works for one part of the company, but has to be rethought and re-implemented for another. Construct a shared system that allows various consumers around the company to apply subject-matter expertise and unique analytics, without worrying about linking, normalization and all the other things that delay insight.”
Once a converged architecture is in place, data integration will naturally follow, says Probstein. “Leave the legacy systems like the enterprise data warehouse in place, and create new back-ends that facilitate and can easily accommodate new sources—particularly types the legacy/relational stack can't handle, like documents, email, web content, free-flowing text, SharePoint, call logs, and surveys.”
The Rise of Big Data Focuses Attention on Enterprise Architecture
Whether it’s based on relational or next-generation technologies, the rise of big data is creating more interest among business and IT leaders in enterprise architecture. Many companies have initiatives underway to introduce enterprise architectural concepts to their data integration efforts. For example, the IBM survey showed that data service architecture, which incorporates service-oriented architecture, “is currently quite popular and is estimated to grow in use by nearly double, up to 73%,” Kopp-Hensley says.
Taking a higher-level view of systems planning means placing more emphasis on repeatable methodologies that address many of the issues around data integration, especially as data sources are added to corporate environments. Enterprise architecture for data integration also puts generally accepted processes in place for data sourcing, scalability, quality, and accessibility.
Leveraging Old and New Solutions in a Common Framework Yields Results
“We are seeing a shift from standalone projects to a more strategic architectural evolution in big data platforms,” says John Haddad, senior director of product marketing for big data at Informatica. He adds that the rise of enterprise architecture doesn’t mean replacing established solutions—rather, it’s a way to bring older and newer solutions into a common framework. “Architectural strategies are evolving—not necessarily emerging,” he says. “They are evolving to include new technologies that complement strategies, and do not necessarily replace them.”
The best approach to a big data integration architecture is to “start small, and pick well-defined and manageable use cases,” Haddad advises. “Prove the value and grow from there.” He also urges organizations to “leverage current infrastructure skills as much as possible.”
Ultimately, it’s the end results that matter, and IBM’s Kopp-Hensley sees a number of benefits coming from high-level efforts to integrate relational database management systems with big data platforms. But it takes a couple of steps to get there. “The first is to modernize the architecture such that the enterprise can leverage big data with performance and speed of delivery,” she says. “Second, provide the business with more advanced insight, more predictive capability, social media analysis, and machine and text analysis. In other words, provide an advanced analytic architecture that allows the ability to leverage all types of analytics against all of data.”