Moving to the cloud may only be a matter of keeping up with the rest of the environment. “Data, regardless of the type, is already on the cloud,” said Raghavan. “Every enterprise customer that I have worked with has petabytes of information stored in the cloud and has multiple clouds for disaster recovery, fault resistance, geographical duplication, as well as workload segmentation, among other things.” However, there are still reasons why a part of an enterprise’s workload will continue to stay on-premise, he continued. Security, for one, may require keeping data within the enterprise walls. This does not mean the on-premise world is free of security breach risks, Raghavan stressed. Companies simply feel more at ease with the perception of complete control over their on-premise installations.
There are other reasons to keep at least analytics data on-premise as well, such as “data transit costs, latency for real-time apps, and cost at scale,” said VanHook. “This is true in cases where compliance requires complete control of the data infrastructure, as well as micro data centers on the edge that preprocess substantial amounts of data from remote locations.” On-premise warehouses are also still desirable for large-scale, purpose-built, high-performance infrastructures for specialized applications, VanHook said. “Examples of this include a real-time data warehouse for a customer-facing service and the real-time routing app for a delivery service.”
Still, the mass migration of data environments continues toward the cloud. While on-premise data management “offers the ultimate control over the location of data, hardware, software, and who has access, it can be expensive and difficult to scale,” said Kundavaram. On-premise data management requires constant maintenance, time-consuming upgrades, and, potentially, outage management, Kundavaram continued. “On the other hand, companies that operate their data lakes and data warehouses in the cloud see a nimbler experience. Operating in the cloud allows businesses to accumulate massive amounts of data and scale their operations accordingly. Data lakes and warehouses that operate in the cloud can manage multiple data streams simultaneously while backing up this data automatically.”
WHAT’S AHEAD
Automation is making data analytics better. In both the immediate and longer-term future, convergence between analytical data platforms will continue, enhanced by automation capabilities, industry observers say. “Automation is drastically changing the ways that we consume and process data,” said Adams. “Machine learning mitigates many of the difficulties associated with applying data science and review principles when processing and storing data. If machine learning gets to the point where information is streamed while processed, data lakes will transition into data warehouses where analytics can be conducted in real time. Effectively, machine learning holds the potential to structure the unstructured and open up a world of new analytic potential.”