Every few years we hear of some new idea or product that will surely bring death to the relational database, or death to the data warehouse, or death to something. It appears that many prefer to see death, or at least greatly enjoy planning for it. Yet these finalities never seem to arrive. The paradigm shifts are heralded, surely not just to sell the latest product or lecture series, but at the end of the day the world has simply turned, as usual, still on its axis. The latest paradigm’s experts may come in and try to build Rome anew, but somehow the new king is much like the old king.
Some think the data lake eliminates the need for a data warehouse. They believe an organized data warehouse that brings things together is no longer necessary. Certainly, data scientists have their focus on data lakes, to churn out ever newer statistical models. The data lake supports the needs they have, and that is good. The data lake has become another piece of the overall puzzle for business intelligence. But this ascension need not be the demise of anything else.
Those who use standard reports and simpler analyses cling strongly to their data marts and have no desire to start learning how to code in R or Python. Just because one group has a new tool that enhances its ability to find new insights, the other groups doing what they have always done have not gone away. And that is where the data scientists may miss the mark: they are not the only invitees to the party.
Likewise, those who perceive the physical data warehouse as the center of the universe are equally myopic. The notion that absolutely everything from every system must be flushed through a centralized data warehouse before anyone may use or touch it for analysis is just as antiquated. Some data truly is too large to move about.
Conversely, some systems are too small, with too few users, to justify the effort of properly decomposing and integrating them into a massive data warehouse. Here the rise of the logical data warehouse paradigm is useful. Some pieces can be “integrated” logically rather than physically and work quite well. But again, this is not a one-size-fits-all world. Just because one piece can be better integrated logically does not mean that all pieces should be integrated logically and that the physical data warehouse must die.
Perhaps everyone must simply take a step back and breathe. New ideas are good. But not every new idea replaces every old idea. Particularly with business intelligence, there are many needs coming from many different types of groups. No single tool or approach addresses every need that may exist. A modern business intelligence solution incorporates multiple pieces: perhaps a data lake, a data warehouse, multiple data marts, logical data warehouse extensions via data virtualization, and even a few more bells and whistles. The data warehouse remains a crucial and necessary component of a broad ecosystem for an enterprise’s data, regardless of what some may believe. Architecture, when done well, provides places for the different kinds of work that need to get done, with homes to support the various approaches that accomplish that work. A place for everything and everything in its place. The data warehouse has a place.
Those wishing the data warehouse away seem to suffer from a narrow focus, seeing a tree and not the forest. These naysayers are like frogs on a rainy night: they continue to croak until morning. But as of today, the data warehouse is not going anywhere.