I recently met with Joe McKendrick, lead analyst at Unisphere Research, to discuss the current state and future of data management. Here’s an excerpt of our conversation.
Nihal Mirashi: Can you describe the database landscape in 2021 and beyond?
Joe McKendrick: The previous year was certainly an extremely eventful one on many levels, and this year is just as unsettled. We have seen timelines for digital transformation—supported by data analytics—suddenly accelerate from five-year horizons to overnight implementations. This acceleration continues unabated, with initiatives ranging from artificial intelligence to edge computing being put on even faster tracks. These initiatives all require enormous volumes of quality data, meaning data managers will remain quite busy well into the foreseeable future.
There are a lot of pieces that need to be kept together and managed, and that’s just the nature of the enterprise beast. The typical enterprise in our surveys at Unisphere Research now manages at least four different database brands, and the overwhelming majority tell us the degree of complexity and volumes of data at their sites are only increasing.
Vendors of the major databases—Microsoft (SQL Server), Oracle, IBM (Db2), and SAP (HANA)—have not been resting on their laurels either. They have been aggressively moving their client bases to the cloud, with robust infrastructure-as-a-service and platform-as-a-service offerings, as well as partnerships with pure-play cloud services. You can see this increased focus on cloud offerings among the NoSQL database vendors as well, such as MongoDB, DataStax, and Couchbase. NoSQL databases have risen from grassroots popularity to becoming part of the mainstream enterprise database landscape. At the same time, the competition in the cloud market is fierce. You need to keep innovating and meeting customers where their needs are heading.
We’re also seeing a greater push toward the automation of data management functions. This is being rolled out by the big database vendors, as well as smaller niche players, and it dovetails with cloud services. There has already been tremendous progress with automating low-level processes such as data standardization, deduplication, and parsing. Now, with machine learning on the rise, even more manual tasks can be performed by software, without human intervention. We’re seeing an increase in automation in the areas of storage, data cataloging, metadata management, reference data management, and data glossaries.
Many organizations are embracing data management automation to finally get unsustainable processes under control. This is not only taming the data flood but also freeing up data managers to play greater roles as innovation advocates within their organizations.
Nihal: Very interesting insights. What do you see as the top trends shaping the direction of modern business applications and the use of databases?
Joe: We’re seeing a convergence of databases and machine learning technologies that is shaping the modern application, and this is more than a technical matter. It is enabling businesses to deploy and continuously improve applications at an exponentially faster rate than could be accomplished previously with manual work or even early-generation tools.
Also, there is greater adoption of sophisticated approaches to presenting data to the business such as interactive dashboards that accelerate insights for faster business decisions. But positioning data as a value-yielding asset requires changing your data culture—making advanced analytics one of your core business capabilities. I understand that Pure sponsored a study that shows the necessity of this change. Organizations that have mature analytics investments outperform their peers across multiple key areas, including operational efficiency, product delivery/time to market, business revenue growth, customer satisfaction, and customer retention. Organizations that ignore analytics investments risk getting left behind. The key to making this happen is to remove the latency caused by data movement between storage sites and compute engines.
We’re seeing the rise of next-generation platforms that are augmenting traditional platforms, and forward-looking organizations are actively replacing legacy technologies with modern, self-service platforms to drive data democratization. Additionally, agile platforms are needed to capture and keep pace with rapidly growing, highly varied unstructured data.
We’re also seeing greater attention to two key methodologies (philosophies, if you will): DevOps and DataOps. DevOps, which is very common in enterprises these days, aligns and automates the work of development and operations teams, assuring these two groups are working in sync to deliver quality software in a consistent and rapid way.
DataOps, still relatively new on the scene, combines DevOps with lean manufacturing principles (the idea of a data factory, if you will) to assure a consistent flow of data operations and analytics. It automates the development, integration, testing, transformation, and delivery of data and analytics across enterprises.
Both DevOps and DataOps are proving to be critical in today’s business and technology climate, helping organizations respond swiftly and intelligently to the plethora of disruptions we’ve been facing. Alongside new tools and processes, these methodologies promise to increase data quality and reduce the time needed to generate insights.
Another trend shaping the database world is the rise of containers, which promote efficiency and productivity in the cloud. We are seeing accelerating adoption of container-enabled infrastructure, thanks to containers’ ability to foster rapid, automated deployment of applications at scale. Older, legacy infrastructures can stand in the way of getting the most from containers, so companies will increasingly upgrade IT architectures with technologies such as Kubernetes. Keep in mind that containerization at scale is incredibly complex and storage should be a major consideration. Storage should be container-smart and container-aware, with the resiliency and availability to support these highly distributed applications.
Nihal: Let’s switch gears and talk about the future of the data professional and career direction. For example, what do all of these changes mean for database administrators? What steps can they take to thrive in this fast-changing environment?
Joe: There’s never been a more challenging, yet more exciting, time to be a database administrator, data manager, or anyone else connected with data. Today’s data managers need to become key advocates and advisors to the business, which requires ever-deeper collaboration with business colleagues.
Here are what I would consider the five habits of effective DBAs:
- Embrace automation. Now, more than ever, businesses are relying on IT departments to drive agility and innovation. However, for many DBAs, just keeping business-critical systems and applications running smoothly has become an increasingly difficult task. Database environments continue to grow in size and complexity while database team sizes at enterprises have remained relatively flat. DBAs need to fully embrace automation and become stewards of how that automation plays out when it comes to the routine, day-to-day tasks that take up valuable DBA time. They have the expertise to help companies document, standardize, and optimize these processes. This will also free up DBA time to tackle work that more directly supports and adds value to business initiatives, whether those are related to data governance, data architecture, or data analytics.
- Embrace cloud. Another area in which DBAs should become experts, and influencers with management, is cloud technologies. Many technical considerations go into migrating databases to the cloud: security, compliance, performance, licensing costs, and egress charges, just to name a few. A big part of succeeding in these initiatives is also thoroughly knowing your data and your technical requirements. DBAs can play a valuable role in not just evaluating but continuing to manage and tweak these environments to ensure they are being utilized in a way that supports the business goals. Both cloud and automation represent an opportunity for DBAs to shift up in the value chain. With this, there is a need for knowledge and understanding of the key differences between managing data locally and managing data remotely. You simply can’t move quickly enough to keep up with the changes in the outside world without direct knowledge of cloud-based offerings.
- Embrace collaboration and remote work. Over the past year, web conferencing, remote collaboration, and online learning became the primary ways of doing business at many companies. This extends to the way data moves through organizations as well. Tasks for moving data from raw to analytics-ready settings will become more engaging, faster, and more iterative, moving at a real-time clip. In addition, managing remote workforces shaped enterprise priorities over the past year and defines the tasks ahead. As work keeps changing, the way we’re dealing with data is changing right along with it. We’re seeing more reliance on cloud-based, or at least cloud-accessed, data and data management. We are seeing a rapid acceleration of remote management tools to support secure backup and monitor employees working outside the corporate walls. Data managers need to support this new workforce through automation along with remote management.
- Embrace digital transformation. DBAs and data managers need to take a leadership role in digital transformation. The role of data managers is more important than ever because data analytics is at the core of every digital transformation effort. There simply can’t be digital transformation without data. It requires data and lots of it. We conducted a survey at the end of 2020 that found that the COVID-19 crisis did not slow things down at all. If anything, it acted as an accelerant to these digital transformation and cloud efforts.
- Embrace system diversity. Data has become so pervasive and complex, with users from so many different domains executing queries, that it requires simple-to-deploy and easy-to-use storage with higher performance than what relational database solutions offer. With this growth of database activity and proliferation of open source databases, the challenge for data managers is assuring the performance and availability of a wide range of solutions from varying providers. Data managers (database administrators, developers, and analysts) are not experts in multiple environments. Oracle databases, for example, have different protocols than MariaDB databases. This complicates important elements such as overall performance, availability, backup and recovery, and disaster recovery.
Stay tuned for part 2 of this interview, coming soon!