Oracle recently announced the availability of the Oracle Cloud Data Science Platform with Oracle Cloud Infrastructure Data Science at the core, helping enterprises to collaboratively build, train, manage, and deploy machine learning models.
The goal of Oracle Cloud Infrastructure Data Science is to improve the collaboration and effectiveness of data science teams with capabilities such as shared projects, model catalogs, team security policies, reproducibility, and auditability. Oracle Cloud Infrastructure Data Science automatically selects optimal training datasets through AutoML algorithm selection and tuning, model evaluation, and model explanation.
“Effective machine learning models are the foundation of successful data science projects, but the volume and variety of data facing enterprises can stall these initiatives before they ever get off the ground,” said Greg Pavlik, senior vice president product development, Oracle Data and AI Services. “With Oracle Cloud Infrastructure Data Science, we’re improving the productivity of individual data scientists by automating their entire workflow and adding strong team support for collaboration to help ensure that data science projects deliver real value to businesses.”
Designed for data science teams and scientists, Oracle Cloud Infrastructure Data Science includes an automated data science workflow that saves time and reduces errors with the following capabilities:
- AutoML algorithm selection and tuning automates the process of running tests against multiple algorithms and hyper-parameter configurations.
- Automated predictive feature selection simplifies feature engineering by automatically identifying key predictive features from larger datasets.
- Model evaluation generates a comprehensive suite of evaluation metrics and suitable visualizations to measure model performance against new data and can rank models over time to enable optimal behavior in production.
- Model explanation provides automated explanation of the relative weighting and importance of the factors that go into generating a prediction.
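As a rough illustration of what automated algorithm selection and tuning involves, the sketch below tries several candidate algorithms and hyper-parameter settings, scores each with cross-validation, and ranks the results. It is not Oracle's implementation or API: the synthetic dataset, the two toy classifiers, and every function name (`make_data`, `knn_fit`, `centroid_fit`, `cross_val_accuracy`, `auto_select`) are assumptions made purely for illustration, written in plain Python with no external libraries.

```python
import random
from collections import Counter


def make_data(n=120, seed=0):
    """Two noisy 2-D clusters; purely synthetic illustration data."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.5
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        rows.append((point, label))
    rng.shuffle(rows)
    return rows


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def knn_fit(train, k):
    """Hypothetical k-nearest-neighbours classifier; k is the hyper-parameter."""
    def predict(x):
        nearest = sorted(train, key=lambda row: sq_dist(row[0], x))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def centroid_fit(train):
    """Hypothetical nearest-centroid classifier with no hyper-parameters."""
    sums = {}
    for (x, y), label in train:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    centroids = {lbl: (sx / c, sy / c) for lbl, (sx, sy, c) in sums.items()}

    def predict(x):
        return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], x))
    return predict


def cross_val_accuracy(fit, data, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        model = fit(train)
        scores.append(sum(model(x) == y for x, y in test) / len(test))
    return sum(scores) / len(scores)


def auto_select(data):
    """Score every (algorithm, configuration) pair and rank best-first."""
    candidates = [("nearest_centroid", {}, centroid_fit)]
    for k in (1, 3, 5, 9):
        candidates.append(("knn", {"k": k}, lambda train, k=k: knn_fit(train, k)))
    results = [
        (cross_val_accuracy(fit, data), name, params)
        for name, params, fit in candidates
    ]
    results.sort(key=lambda r: r[0], reverse=True)
    return results


if __name__ == "__main__":
    for score, name, params in auto_select(make_data()):
        print(f"{name:17s} {str(params):10s} accuracy={score:.3f}")
```

Running the script prints a small leaderboard of the five candidate configurations, best first. A production AutoML service would search far larger spaces with smarter strategies, but the loop of "try, score, rank" is the same basic idea.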
The Oracle Cloud Data Science Platform includes seven new services to accelerate and improve data science results:
- Oracle Cloud Infrastructure Data Science enables users to build, train and manage new machine learning models on Oracle Cloud using Python and other open-source tools and libraries including TensorFlow, Keras and Jupyter.
- Machine learning algorithms are tightly integrated in Oracle Autonomous Database with new support for Python and automated machine learning. Upcoming integration with Oracle Cloud Infrastructure Data Science will enable data scientists to develop models using both open source and scalable in-database algorithms.
- Oracle Cloud Infrastructure Data Catalog allows users to discover, find, organize, enrich and trace data assets on Oracle Cloud. Oracle Cloud Infrastructure Data Catalog has a built-in business glossary, making it easy to curate and discover the right, trusted data.
- Oracle Big Data Service offers a full Cloudera Hadoop implementation, with dramatically simpler management than other Hadoop offerings, including just one click to make a cluster highly available and to implement security. Oracle Big Data Service also includes machine learning for Spark, allowing organizations to run Spark machine learning in memory with one product and with minimal data movement.
- Oracle Cloud SQL enables SQL queries on data in HDFS, Hive, Kafka, NoSQL and Object Storage. Only Oracle Cloud SQL enables any user, application or analytics tool that can talk to Oracle databases to transparently work with data in other data stores, with the benefit of push-down, scale-out processing to minimize data movement.
- Oracle Cloud Infrastructure Data Flow is a fully managed big data service that allows users to run Apache Spark applications with no infrastructure to deploy or manage. It enables enterprises to deliver Big Data and AI applications faster. Unlike competing Hadoop and Spark services, Oracle Cloud Infrastructure Data Flow includes a single window to track all Spark jobs, making it simple to identify expensive tasks or troubleshoot problems.
- Oracle Cloud Infrastructure Virtual Machines for Data Science provide preconfigured GPU-based environments with common IDEs, notebooks and frameworks.
For more information, go to www.oracle.com.