With containers, if it is easy to move databases back and forth between on-prem and cloud environments, “perhaps developers will check entire databases in and out of the cloud for the day,” said Fuchs. In this way, he said, if they need high-speed data access, they can check a database out to on-prem and then, when they are done, check it back into the cloud.
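To make that idea concrete, here is a minimal sketch of a check-out/check-in workflow driving the Docker CLI from Python. The registry address, image name, and tags are illustrative assumptions, not anything Fuchs prescribes.

```python
"""Sketch of 'checking a database out of the cloud' with containers.
Assumes Docker is installed; the registry and image names are invented."""
import subprocess

REGISTRY = "registry.example.com/analytics"          # hypothetical cloud registry
IMAGE = f"{REGISTRY}/orders-db:snapshot-2024-06-01"  # hypothetical snapshot tag

def check_out():
    """Pull the containerized database from the cloud registry and run it
    locally for high-speed, on-prem access."""
    subprocess.run(["docker", "pull", IMAGE], check=True)
    subprocess.run([
        "docker", "run", "-d", "--name", "orders-db",
        "-p", "5432:5432", IMAGE,
    ], check=True)

def check_in():
    """Snapshot the local container and push it back to the cloud registry
    when the day's work is done, then remove the local copy."""
    new_tag = f"{REGISTRY}/orders-db:snapshot-2024-06-02"
    subprocess.run(["docker", "commit", "orders-db", new_tag], check=True)
    subprocess.run(["docker", "push", new_tag], check=True)
    subprocess.run(["docker", "rm", "-f", "orders-db"], check=True)
```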
Such new approaches can help businesses gain meaningful insights from data—“allowing for a more strategic and competitive advantage,” Brey said. “There is no longer a highly static and exclusive relationship between databases and storage. With data in motion, automated pipelines enable the fluid mobility of data from edge to core. But organizations are struggling with the volume and variety of data they have available to them. By implementing new methodologies—data in action via analytics—AI and machine learning will enable organizations and large enterprises to easily extract the value from their data, convert it into actionable insights, and formulate strategic business decisions.”
The emerging agile data environment also will promote greater collaboration and interaction well beyond the boundaries of the data center. “Data engineering cloud environments enable a diversity of stakeholders to work together using data to transform business processes,” said Hellerstein. “Getting this right organizationally requires a big-tent approach that connects data experts and domain experts, hand-coders and no-coders, as well as engineering discipline and business agility. Technical innovations that allow those constituencies to work together seamlessly will be super-high leverage. The innovations I’d look for are the ones that make things simple.”
Greater collaboration lends itself to greater visibility across the data-driven enterprise as well. The “meta-orchestration” made possible through agile processes and solutions “will enable collaboration and innovation while enforcing security and governance policies,” said Chris Bergh, CEO of DataKitchen. “Observability will add testing, logs, and metrics that describe every nuance of analytics creation and operations. This will give DBAs unprecedented visibility into the real-time operation of data pipelines and team productivity.”
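A small sketch of what that observability can look like in practice, assuming nothing about DataKitchen’s own tooling: each pipeline step below writes logs, runs a data test, and records simple metrics using only the Python standard library. Step names, the test threshold, and the in-memory metrics store are invented for the example.

```python
"""Illustrative observability wrapper for a data pipeline step:
logs, a data test, and duration/row-count metrics per step."""
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")
metrics = {}  # stand-in for a real metrics backend

def observed_step(name, test):
    """Decorator that wraps a pipeline step with logging, a data test,
    and simple metrics."""
    def wrap(fn):
        def run(rows):
            start = time.monotonic()
            log.info("step %s started with %d rows", name, len(rows))
            out = fn(rows)
            if not test(out):
                log.error("step %s failed its data test", name)
                raise ValueError(f"data test failed in {name}")
            metrics[f"{name}.seconds"] = time.monotonic() - start
            metrics[f"{name}.rows_out"] = len(out)
            log.info("step %s finished", name)
            return out
        return run
    return wrap

@observed_step("drop_null_customers", test=lambda rows: len(rows) > 0)
def drop_null_customers(rows):
    return [r for r in rows if r.get("customer_id") is not None]

if __name__ == "__main__":
    cleaned = drop_null_customers([{"customer_id": 1}, {"customer_id": None}])
    print(cleaned, metrics)
```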
Such techniques and technologies could even begin to alter the very definition of databases themselves. Scalable compute platforms, all of which are now either containerized or serverless, have grown increasingly sophisticated and go beyond their core analytics and machine learning capabilities to also act as databases, said Edgar Honing, data science and data engineering consultant with AHEAD. “Such platforms provide for open connectivity and can be extended to support a variety of data sources. They now act as data integration platforms, blurring the line between databases and machine learning and analytics platforms.”
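One way to picture that blurring, sketched with PySpark standing in for a scalable compute platform; the source names no specific product, and the hosts, credentials, and table names below are invented for illustration.

```python
"""Sketch: one engine reads from an operational database and from object
storage, then answers SQL like a database while feeding analytics and ML."""
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data-integration-sketch").getOrCreate()

# Pull a table straight from an operational database over JDBC (assumed host).
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://db.example.com/sales")
          .option("dbtable", "public.orders")
          .option("user", "reporter")
          .option("password", "***")
          .load())

# Combine it with files landed from the edge (assumes S3 connectors configured).
devices = spark.read.parquet("s3a://edge-landing/devices/")

# The same joined view can be queried with SQL or handed to pyspark.ml.
orders.join(devices, "device_id").createOrReplaceTempView("enriched_orders")
daily = spark.sql("""
    SELECT order_date, region, SUM(amount) AS revenue
    FROM enriched_orders
    GROUP BY order_date, region
""")
daily.show()
```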
Reshaping Data Jobs
The emerging agile data approaches, including containers, microservices, DevOps, and DataOps, will have a direct impact on the roles of DBAs and data managers, industry observers agree. DataOps, which is enabled by DevOps, is particularly important to data management activities going forward, said Bergh. “DataOps microservices-based data architectures will enable greater collaboration within the data organization by creating sharable components that are reused throughout teams and organizations,” he pointed out. “DataOps also incorporates DevOps capabilities, which will integrate, deliver, and deploy new analytics into production using automated orchestrations. DataOps orchestration will be an essential part of creating analytics efficiently and with minimal manual effort.”
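As a rough illustration of those ideas, and not a depiction of any particular DataOps product, the sketch below registers reusable components in a shared catalog and has a simple orchestrator run a declared pipeline, gating the result behind a release test. All component and field names are invented.

```python
"""Sketch of sharable components plus automated orchestration."""
from typing import Callable, Dict, List

# A shared catalog of reusable components that any team can compose.
CATALOG: Dict[str, Callable[[list], list]] = {}

def component(name: str):
    def register(fn):
        CATALOG[name] = fn
        return fn
    return register

@component("standardize_country")
def standardize_country(rows):
    return [{**r, "country": r["country"].strip().upper()} for r in rows]

@component("dedupe_by_id")
def dedupe_by_id(rows):
    return list({r["id"]: r for r in rows}.values())

def orchestrate(pipeline: List[str], rows: list) -> list:
    """Run a declared pipeline of catalog components in order, then run a
    release test before the result is 'deployed' (here, simply returned)."""
    for name in pipeline:
        rows = CATALOG[name](rows)
    assert all("country" in r for r in rows), "release test: country present"
    return rows

if __name__ == "__main__":
    raw = [{"id": 1, "country": " us "}, {"id": 1, "country": "US"}]
    print(orchestrate(["standardize_country", "dedupe_by_id"], raw))
```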