Oracle has announced that Oracle MySQL HeatWave now supports in-database machine learning (ML) in addition to the previously available transaction processing and analytics. MySQL HeatWave ML fully automates the ML lifecycle and stores all trained models inside the MySQL database, eliminating the need to move data or the model to a machine learning tool or service. Eliminating ETL reduces application complexity, lowers cost, and improves security of both the data and the model. HeatWave ML is included with the MySQL HeatWave database cloud service in all 37 Oracle Cloud Infrastructure (OCI) regions.
Until now, Oracle said, adding machine learning capabilities to MySQL applications has been prohibitively difficult and time-consuming for many developers.
MySQL HeatWave ML solves many of the problems involved in doing so by natively integrating machine learning capabilities inside the MySQL database, eliminating the need to ETL the data to another service. HeatWave ML fully automates the training process and creates a model with the best algorithm, optimal features, and the optimal hyper-parameters for a given data set and a specified task. All models generated by HeatWave ML can provide model and prediction explanations.
“Just as we integrated analytics and transaction processing within a single database, we are now bringing machine learning inside MySQL HeatWave,” said Edward Screven, chief corporate architect, Oracle. “MySQL HeatWave is one of the fastest growing cloud services at Oracle. An increasing number of customers have migrated from Amazon and other cloud database services to MySQL HeatWave and have gained significant performance improvements and lower costs. Today, we are also announcing a number of other innovations which enrich HeatWave’s capabilities, improve availability, and lower the cost.”
HeatWave ML offers the following capabilities:
Fully Automated Model Training: All the stages of creating a model with HeatWave ML are fully automated and require no intervention from developers. The result is a tuned model that is more accurate and requires no manual work, and the training process always runs to completion.
Model and Inference Explanations: Model explainability helps developers understand the behavior of a machine learning model. For example, if a bank denies a client a loan, the bank needs to be able to determine which parameters of the model were taken into account, or whether the model contains any bias. Prediction explainability is a set of techniques that help answer the question of why a machine learning model made a specific prediction. Prediction explanations are becoming increasingly important as companies must be able to explain the decisions made by their machine learning models. HeatWave ML integrates both model explanations and prediction explanations into its model training process. As a result, all models created by HeatWave ML can offer model as well as inference explanations without needing the training data when an inference is explained. Oracle has augmented existing explanation techniques to improve performance, interpretability, and quality.
Hyper-Parameter Tuning: HeatWave ML implements a new gradient search-based reduction algorithm for hyper-parameter tuning. This enables the hyper-parameter search to be executed in parallel without compromising the model accuracy. Hyper-parameter tuning is the most time-consuming stage of ML model training, and this unique capability provides HeatWave ML with a significant performance advantage over other cloud services for building machine learning models.
Algorithm Selection: HeatWave ML uses the notion of proxy models—which are simple models exhibiting the properties of a full complex model—to determine the best ML algorithm for training. Using a simple proxy model, algorithm selection is done very efficiently without loss of accuracy. No other database services for building machine learning models have this proxy modeling capability.
Intelligent Data Sampling: During model training, HeatWave ML samples a small percentage of the data in order to improve performance. This sampling is done in such a manner that all representative data points are captured in the sample data set. Other cloud services for building machine learning models take a less efficient approach—using random data sampling—which samples a small percentage of data without considering the data distribution characteristics.
Feature Selection: Feature selection helps determine the attributes of the training data which influence the machine learning model behavior for making predictions. The techniques in HeatWave ML for feature selection have been trained over a broad swath of data sets across multiple domains and applications. From these gathered statistics and meta information, HeatWave ML is able to efficiently identify the relevant features in a new data set.
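The contrast drawn above between distribution-aware sampling and purely random sampling can be illustrated with stratified sampling, a standard technique. This is only a sketch: the announcement does not describe how HeatWave ML actually selects its sample, and the function name, the imbalanced data set, and the 10 percent rate below are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(rows, label_of, percent, seed=0):
    """Draw about `percent`% of rows from each class, keeping at least one per class.

    Hypothetical helper for illustration; not HeatWave ML's algorithm.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    sample = []
    for group in by_label.values():
        # Integer arithmetic avoids floating-point rounding surprises.
        k = max(1, len(group) * percent // 100)
        sample.extend(rng.sample(group, k))
    return sample


# Assumed data set: 1,000 common rows and 20 rare rows.
rows = [("ok", i) for i in range(1000)] + [("fraud", i) for i in range(20)]

strat = stratified_sample(rows, label_of=lambda r: r[0], percent=10)
strat_counts = Counter(label for label, _ in strat)

# A uniform random sample of the same size, for comparison.
uniform = random.Random(0).sample(rows, len(strat))
uniform_counts = Counter(label for label, _ in uniform)

print("stratified:", dict(strat_counts))
print("uniform:   ", dict(uniform_counts))
```

The stratified sample keeps every class at its original proportion by construction, whereas the share of rare rows in the uniform sample depends on the draw and can fall to zero for very rare classes.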
In addition to the machine learning capabilities, Oracle released further enhancements to the MySQL HeatWave service. Real-time elasticity lets customers scale their HeatWave cluster up or down to any number of nodes, without downtime or read-only time and without manually rebalancing the cluster. Data compression lets customers process twice as much data per node and lowers costs by nearly 50 percent while maintaining the same price-performance ratio. Finally, a new pause-and-resume function lets customers pause HeatWave to save costs. Upon resuming, both the data and the statistics needed for MySQL Autopilot are automatically reloaded into HeatWave.
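The link between doubling the data held per node and a cost reduction of nearly 50 percent can be checked with simple arithmetic. The sketch below assumes that cost scales with the number of nodes; the data size and per-node capacity are hypothetical figures chosen for illustration, not Oracle's numbers.

```python
def nodes_needed(data_gb, capacity_gb_per_node):
    """Whole nodes required to hold the data (ceiling division)."""
    return -(-data_gb // capacity_gb_per_node)


DATA_GB = 4100      # assumed data set size
CAPACITY_GB = 400   # assumed per-node capacity without compression

nodes_before = nodes_needed(DATA_GB, CAPACITY_GB)
nodes_after = nodes_needed(DATA_GB, 2 * CAPACITY_GB)  # twice the data per node

# With cost proportional to node count, the saving is the drop in nodes.
saving = 1 - nodes_after / nodes_before

print(f"nodes: {nodes_before} -> {nodes_after}, saving: {saving:.0%}")
```

In this example the cluster shrinks from 11 nodes to 6, because node counts round up to whole numbers, so the saving lands a little under one half.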
More information is available at www.oracle.com/mysql/heatwave