IT and data managers can see only part of the data now moving through their organizations, and they’re struggling to get a better view.
Today’s database landscape is increasingly specialized and best of breed, due to the expanding range of new varieties of databases and platforms—led by NoSQL, NewSQL, and Hadoop. This is complicating the already difficult job of bringing all these data types together into a well-integrated, well-architected environment. Can it be done? Or perhaps the more appropriate question to ask is: Should it be done?
While there have always been many database choices, it’s only recently that enterprises have been embarking on new journeys with their data strategies. “We have been in a very innovative period,” says Claudia Imhoff, business intelligence consultant and founder of the Boulder BI Brain Trust. “If you look back over the last three to four years, we’ve had a lot of innovations in database design. We’ve had the three major players—Oracle, IBM, and Microsoft. Now there’s HANA in the mix, along with Hadoop and MapReduce, columnar databases, NoSQL and NewSQL. It’s been disruptive to the status quo.”
‘Database Wars’ Redux?
To Tony Fisher, vice president of data collaboration and integration at Progress, it all brings back memories of the industry’s “database wars” of the 1990s. “There were more than 15 different database types, all competing for business,” he recalls. “In fact, there were so many options that determining how to house and access data was in itself a battle of uncertainty.” By the early 2000s, however, enterprises had begun to consolidate their data environments onto the leading relational database management systems—Oracle, IBM DB2, Microsoft SQL Server and Sybase, acquired by SAP in 2010. Today, the database wars are back, Fisher says.
The rise of web-scale computing is also driving new generations of databases. “Ten or 15 years ago, the paradigm was to deploy relational databases, use a DBA, architect schemas, write database code to define the database interaction and build applications on top of the data models,” says Sehmuz Bayhan, senior director of data platform engineering and operations for PayPal. “This was a very secure approach, but it was not web scale. Enterprises need to scale to a massive deployment easily and quickly, and to integrate data and code in a seamless manner.”
Sensors
New databases and platforms are growing increasingly popular because they offer low-cost, low-complexity solutions for capturing and analyzing unstructured forms of data that haven’t been available until recently. “There’s sensor data, for example,” says Imhoff. “We’ve had RFID tags and machine sensor data for how many years? A decade now? But we’re just now beginning to figure out that we can actually do something with it. We’ve just never had a mechanism to do anything with it efficiently. If we did, it took forever, and cost an arm and a leg. Now, we’ve got much more inexpensive databases and tools, such as Hadoop. Before, people said ‘we’ve got all this data, but we can’t store it and we can’t analyze it.’ Now, all of a sudden they’re realizing, ‘we can store it, and we can analyze it.’”