Increasingly, DBAs are seeing artificial intelligence (AI), and machine learning applied to database management and optimization, taking self-healing and self-tuning to the next level. These solutions, from both database and third-party vendors, allow DBAs to spend less time searching for bottlenecks, and more time doing more productive and creative work in support of strategic business goals.
What are AI and Machine Learning?
AI and machine learning have entered the mainstream in the last couple of years. Here are some brief descriptions to help understand this landscape before exploring how these technologies can benefit DBAs:
- AI covers anything where a machine imitates certain “cognitive” human functions such as learning and problem solving. Examples include
automatic trading systems, autonomous cars and intelligent routing, and delivery systems. - Machine Learning is a subset of AI that uses statistical techniques to allow computers to model and predict outcomes using datasets. Examples
include email filtering, fraud detection, and ranking systems to drive online marketing. - Deep Learning is a specific class of machine learning that uses artificial neural networks, as opposed to machine learning task-oriented algorithms. Examples include computer vision, speech recognition, and natural language processing.
How AI Research Has Contributed to Autonomous DBMS Capabilities
Imagine that your database system (DBMS) could anticipate operational issues and take prescriptive actions to prevent them by allocating additional resources, adding or dropping indexes, or automatically adjusting query execution plans. This is the idea behind autonomous
databases and machine learning, which can predict when a problem may occur and either warn the DBA or automatically take action.
Over the last 5–10 years, academic research has explored machine learning process pipelines for tuning database management systems (DBMS) automatically.
Two examples of this research are:
- OtterTune, a project from the Carnegie Mellon Database Research Group, leverages data collected from previous DBMS workloads to tune new ones using a machine learning pipeline. It uses this data to build machine learning models that capture how the DBMS responds todifferent configurations. OtterTune uses these models to guide experimentation for new applications and recommend settings that improve a target objective, such as reducing latency or improving throughput.
- DBSeer is a free, open source framework developed by MIT that uses machine learning and statistical regression techniques to identify bottleneck resources and predict performance for a given set of resources.
What AI Capabilities Are Being Integrated Into Database Products?
DMBSs that handle large amounts of data and operate complex workloads are difficult to manage because they have hundreds of configuration settings that require experts to administer.
Companies such as Microsoft and Oracle have started to incorporate AI and machine learning in databases to enable “autopilot” monitoring, and help DBAs proactively address problems caused by mistuned databases.
Some examples include:
- Microsoft Azure SQL Database Advisor provides custom performance tuning recommendations and suggests performance improvements that can be automatically applied.
- Microsoft SQL Server Query Store monitors query usage and provides an adaptive plan to help identify all queries where execution plan metrics have degraded, ultimately reporting on them or regressing them.
- Oracle 18c Autonomous Database automatically upgrades, patches, and tunes itself by learning what “normal” looks like to help eliminate complexity, human error and manual management.
What MLFeatures Can Be Used by DBAs to Benefit Businesses?
Python and R have become popular languages in developing machine learning applications, with database vendors actively
integrating them to provide statistics, graphic representation and data analysis.
As a result, DBMS technology has become a catalyst for implementing machine learning solutions that can help businesses make
better decisions. Some examples of this are:
- Microsoft SQL Server Machine Learning Server and R Server are standalone server examples that run independently of SQL Server database engine instances and provide a development environment with distributed and parallel processing of R and Python workloads over small-tolarge datasets, using the proprietary packages and calculation engines installed with the server.
- Microsoft SQL Server Machine Learning and R Services are in-database installations of machine learning that operate within the context of a SQL Server database engine instance, providing R and Python external script support for resident data in SQL Server instances, and keeping analytics close to the data to eliminate the costs and security risks associated with data movement.
- Oracle R Enterprise is a component of the Oracle Database Enterprise Edition “Advanced Analytics Option” that makes the R language and environment ready for analyzing data by allowing users to run R commands and scripts for statistical, machine learning and graphical analysis on data stored in the Oracle Database.
While these integrations provide many benefits, certain essential DBA job functions uch as performance tuning will still require human supervision to ensure they are properly secured and that any machine learning scripts do not compromise database performance. DBAs are also critical in helping to coordinate the needs of data scientists, application developers, architects, and data engineers throughout the data lifecycle.
What is the DBA Role of the Future?
The explosion of IoT devices and massively streamed real-time data will challenge future DBAs to manage the storagend processing of big data, and protect data privacy and data sovereignty while extracting value from this data.
DBAs should feel empowered by new, emerging AI and machine learning applications because data has never been more
important, and no individual has a better understanding of how to manage and gather that data than the DBA.
The adoption of cloud databases and hybrid infrastructures to handle this data explosion and scalability requirements will place new demands on DBAs while enabling them to automate resource demand, performance monitoring, tuning indexes, and query execution, and to strengthen auto
detection of threats and vulnerabilities.
In the end, all of these trends will contribute to the evolving role of the DBA as a strategic contributor to business success.