New technologies such as Hadoop provide enterprises with another option besides the enterprise data warehouse when it comes to their data storage. Nonetheless, data warehouses still provide value to companies. What you need to know is when to use your data warehouse, when to use Hadoop, the advantage of using both in your information supply system, and the key strategies for success. These issues were examined in a recent DBTA roundtable webcast that is now available for replay.
The webcast was presented by Kevin Petrie, senior director, Attunity; George Corugedo, CTO & co-founder, RedPoint Global Inc.; and Nitin Bandugula, senior product marketing manager, MapR.
There are certain times to use data warehouses and certain instances for which Hadoop is best suited, pointed out Attunity's Petrie. “Data warehouses are used for structured tables, integrated data, single version of truth, and centralized processing. On the other hand, Hadoop allows for structured and unstructured data that can be stored in its raw format. It is very cost effective hardware, highly scalable, and distributed processing,” noted Petrie.
According to a recent TDWI survey, the top two reasons for change in database architecture are the scale of analytics and the speed of analytics, said Bandugula of MapR. Bandugula added that enterprise data warehouses are often not being used efficiently, and 70% of data is unused. Hadoop is able to provide storage with more efficient cost models and is also able to work with more data sources, he said.
Data quality issues, including integration and matching, are very important to data management, said Corugedo. “Data quality is all about being able to have data in the proper format,” he said. It can be as simple as making sure an email address is valid or making sure a phone number follows the same structure as the rest of the phone numbers. Integration and matching, he noted, are critical in order to keep data relationships accurate.
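To make the two simple checks Corugedo mentioned concrete, here is a minimal Python sketch. It was not shown in the webcast. The function names, the loose email pattern, and the choice of a 10-digit U.S.-style phone format are all assumptions made for illustration, and a production data quality tool would apply far more thorough rules.

```python
import re
from typing import Optional

# Assumption: a deliberately loose pattern (one "@", no whitespace,
# at least one dot in the domain). It flags obviously malformed values
# and does not prove that an address can receive mail.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_plausible_email(value: str) -> bool:
    """Return True if the value looks like an email address."""
    return EMAIL_PATTERN.match(value.strip()) is not None


def normalize_us_phone(value: str) -> Optional[str]:
    """Rewrite a phone number as NNN-NNN-NNNN, or return None.

    Assumption: numbers are 10-digit U.S.-style numbers, optionally
    prefixed with the country code 1.
    """
    digits = re.sub(r"\D", "", value)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return None  # cannot be brought into the common structure
    return f"{digits[0:3]}-{digits[3:6]}-{digits[6:]}"
```

Records that fail either check could be routed to a review queue rather than loaded as-is, so that every phone number in the target system shares one structure.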
To view an on-demand replay of "Hadoop and Your Enterprise Data Warehouse," go here.