In addition, applications developed and deployed within this new agile environment “follow a different cadence than traditional application development,” said Richard Chart, chief scientist at ScienceLogic. Common characteristics include “early deployment and rapid, incremental change.”
At the same time, data and software may finally be freed of the constraints of underlying hardware and operating systems. “Now it’s just software and it can run on all sorts of hardware, perhaps even at the same time with complex orchestration layered over a variety of hardware,” said Sri Raghavan, director of data science and advanced analytics product marketing at Teradata. DevOps teams now have a wider range of architectures to manage, and, while this increases complexity, it also makes software applications useful for a wider range of use cases, he noted.
Automation is also playing an increasing role in elevating data professionals’ roles in their businesses. Container technology is an example of the enhanced automation that can be introduced to data environments. “Kubernetes automation radically reduces the operational burden of provisioning, managing, and operating distributed data infrastructure for DBAs and data managers,” said Monte Zweben, CEO of Splice Machine. “It enables full dependency automation for provisioning, continuously monitors the health of the system, automatically remediates system inconsistencies, and can auto-scale the components as workloads change—both up and down.”
However, moving to more agile data environments may still be hampered by the organizational challenges that exist between infrastructure teams and data analytics teams, cautioned Mike Potter, CTO of Qlik. “They tend to have different skill sets and areas of focus and are mostly working in silos. To take best advantage of these new technologies, it’s important to develop an infrastructure strategy that blends the variety of skills amongst teams and keeps the lines of communication open. Every DBA and data manager needs to think about the type of workloads required to service the business and how the new services will be implemented, and then deploy the infrastructure with the right blended team in mind.”
In this emerging world, data professionals may find themselves assuming greater roles in data engineering, said Joe Hellerstein, chief strategy officer for Trifacta and professor at the University of California–Berkeley. This means placing more attention on data quality; authoring and maintaining data transformation pipelines; operationalizing those pipelines; and managing metadata. “These issues take the most time and drain the most talent from our data professional pools, and they need to be modernized.” The good news, he said, is “that time spent on data engineering today often connects directly to business use cases. Even if the cloud vendors waved a wand and made all the problems of containers and microservices disappear, we would still need solutions that take in raw data and convert it into business value. That’s the primary role for data engineering in a healthy organization.”
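The pipeline work Hellerstein describes, taking in raw data, enforcing quality rules, and recording operational metadata about each run, can be sketched in a few composable steps. A minimal, hypothetical Python example (the field names and quality rules are invented for illustration; this is not Trifacta’s tooling):

```python
import csv, io
from datetime import datetime, timezone

RAW = ("customer_id,signup_date,region\n"
       "101,2023-04-01,EMEA\n"
       ",2023-04-02,APAC\n"       # missing id -> fails quality rule
       "103,not-a-date,AMER\n")   # bad date   -> fails quality rule

def transform(raw_csv: str):
    """Validate raw rows and return (clean_rows, run_metadata)."""
    clean, dropped = [], 0
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Quality rules: require a customer id and a parseable date.
        if not row["customer_id"]:
            dropped += 1
            continue
        try:
            datetime.strptime(row["signup_date"], "%Y-%m-%d")
        except ValueError:
            dropped += 1
            continue
        clean.append(row)
    # Operational metadata: when the pipeline ran and its yield.
    meta = {"run_at": datetime.now(timezone.utc).isoformat(),
            "rows_in": len(clean) + dropped, "rows_out": len(clean)}
    return clean, meta

rows, meta = transform(RAW)
print(meta["rows_in"], meta["rows_out"])  # 3 1
```

Operationalizing such a pipeline means running it on a schedule and persisting the metadata record alongside the output, so data quality can be tracked over time rather than checked by hand.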
Improving Data Delivery
Along with enabling increased involvement in the business, agile data environments are also speeding the delivery of data insights and associated applications to the business. Companies such as Amazon “deliver numerous software releases per second,” said Chris Bergh, CEO and founder of DataKitchen. “This would be impossible without environment and release automation, and technologies like infrastructure-as-code. Applying these capabilities in data organizations will transform enterprises. The world will be divided into companies with nimble and accurate analytics that facilitate insights and others who struggle to keep up.”
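The infrastructure-as-code approach Bergh cites treats an environment as a declarative description that tooling reconciles idempotently, so the same release can be applied any number of times with the same result. A toy Python sketch of the idea (real tools such as Terraform use their own configuration languages; the resource names here are invented):

```python
def plan(current: dict, desired: dict) -> dict:
    """Diff current infrastructure against the declared desired state,
    producing the actions an apply step would take."""
    return {
        "create": [r for r in desired if r not in current],
        "update": [r for r in desired
                   if r in current and current[r] != desired[r]],
        "delete": [r for r in current if r not in desired],
    }

# Desired state lives in version control as data, not as manual steps.
desired = {"analytics-db": {"size": "large"}, "report-api": {"size": "small"}}
current = {"analytics-db": {"size": "small"}}

print(plan(current, desired))
# {'create': ['report-api'], 'update': ['analytics-db'], 'delete': []}
```

Because the plan is computed from a diff, re-applying an already-converged environment produces no actions, which is what makes automated, per-release environment provisioning safe at high release cadence.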