Features of In-Database Algorithms
Oracle Machine Learning offers a suite of tools enhancing productivity for data scientists, developers, and data engineers. This suite streamlines machine learning model development, evaluation, and deployment, catering to both experts and non-experts in the field.
In-database Machine Learning:
- Perform ML operations directly within Oracle Database without exporting data to separate ML engines. This approach eliminates data movement, ensuring efficiency and data security.
- Oracle uses parallelized and distributed algorithms, scaling seamlessly across cluster nodes for faster processing.
- It optimizes memory usage and leverages Exadata’s storage-tier function push-down for high-speed scoring.
Scalability and Deployment:
- Perform batch and real-time predictions using OML’s scalable architecture.
- Use prediction operators in SQL queries, or call them from programming languages such as Python and R.
- OML on Autonomous Database Serverless supports no-code deployment through REST interfaces, making deployment accessible to users with varying technical skills.
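As a rough illustration of scoring with SQL prediction operators, the sketch below builds a query string that uses Oracle's `PREDICTION` and `PREDICTION_PROBABILITY` SQL functions. The model, table, and column names (`CHURN_MODEL`, `CUSTOMERS`, `CUST_ID`) and the helper `build_scoring_query` are hypothetical, and the code only constructs the statement; running it would require a database connection that is not shown here.

```python
def build_scoring_query(model_name: str, table_name: str, id_column: str) -> str:
    """Return a SQL statement that scores every row of a table with an in-database model."""
    # Identifiers cannot be passed as bind variables, so accept only names
    # made of letters, digits, and underscores.
    for name in (model_name, table_name, id_column):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT {id_column},\n"
        f"       PREDICTION({model_name} USING *) AS predicted_value,\n"
        f"       PREDICTION_PROBABILITY({model_name} USING *) AS probability\n"
        f"FROM {table_name}"
    )


# Hypothetical names, used only for illustration.
query = build_scoring_query("CHURN_MODEL", "CUSTOMERS", "CUST_ID")
print(query)
```

Because the scoring happens inside the query, the same statement could serve batch scoring over a whole table or, with a `WHERE` clause on a single key, a single-row lookup.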
Machine Learning Models as First-Class Database Objects:
- Manage models with database-level access control, ensuring secure handling.
- Track user actions through auditing, providing insights into usage and changes.
- Export and import models between databases for efficient sharing and reuse.
- Leverage database tools for backup, recovery, and secure storage of ML models.
Data Preparation:
- Automate key preparation steps. Most data must be cleansed, filtered, normalized, sampled, and transformed in various ways before it can be mined.
- Reduce manual effort. Data preparation often accounts for up to 80% of the effort in a machine learning project.
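To make two of these steps concrete, here is a minimal Python sketch of missing-value replacement followed by min-max normalization. It illustrates the kind of transformation involved, not how Oracle Machine Learning implements it internally; the function names and the sample `ages` column are assumptions made for the example.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing entry.
ages = [20, None, 40, 60]
prepared = min_max_normalize(fill_missing(ages))
print(prepared)
```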
Text Processing:
- Extract useful information from unstructured text, transforming it into structured data using machine learning techniques.
- Text tokens or features allow querying and deriving insights from text data to address business challenges effectively.
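The idea of turning unstructured text into queryable tokens can be sketched in a few lines of Python. This is a simplified stand-in that assumes whitespace-and-punctuation tokenization and a tiny hand-picked stop-word list; it is not the in-database text processing itself, and the sample reviews are invented for the example.

```python
import re
from collections import Counter

# A deliberately small stop-word list, assumed for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "and", "to", "of"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens, minus stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def term_counts(documents):
    """Turn each document into a structured token-to-count mapping."""
    return [Counter(tokenize(doc)) for doc in documents]


# Invented sample documents.
reviews = [
    "The delivery is late and the box is damaged",
    "Fast delivery, great service",
]
features = term_counts(reviews)

# Once text is structured, it can be queried like any other data:
# which documents mention "late"?
late_docs = [i for i, counts in enumerate(features) if counts["late"] > 0]
print(late_docs)
```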
Partitioned Models:
- Divide data into subsets based on characteristics to organize multiple models efficiently.
- Use partitioning to manage diverse data sets while maintaining clarity and improving model management.
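The partitioning idea can be sketched as follows: group the rows by a partition attribute, fit one model per group, and route each row to its partition's model at scoring time. In this sketch each "model" is just the partition mean, chosen to keep the example short; the `region` and `amount` fields and both function names are hypothetical, and the database automates this bookkeeping in ways not shown here.

```python
from collections import defaultdict
from statistics import mean


def fit_partitioned(rows, partition_key, target):
    """Fit one trivial model (the target mean) per distinct partition value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row[target])
    return {key: mean(values) for key, values in groups.items()}


def predict(models, row, partition_key):
    """Score a row with the model belonging to that row's partition."""
    return models[row[partition_key]]


# Invented sample data.
sales = [
    {"region": "EMEA", "amount": 100.0},
    {"region": "EMEA", "amount": 200.0},
    {"region": "APAC", "amount": 50.0},
]
models = fit_partitioned(sales, "region", "amount")
print(predict(models, {"region": "EMEA"}, "region"))
```

Callers address the collection of per-partition models through one handle (`models`), which mirrors the point above about keeping many models organized under a single name.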
Faster Time-to-Market Solutions:
- Deploy ML models instantly with SQL prediction operators and REST interfaces.
- Run predictions directly from R or Python environments without additional tools.
- Simplify deployment on Autonomous Database Serverless to deliver actionable insights quickly, streamlining workflows for both experts and non-experts.
Topics:
- Automatic Data Preparation
Machine learning models often require data transformations before training. Oracle Machine Learning (OML) automates this process using Automatic Data Preparation (ADP). ADP applies to OML4SQL, OML4Py, and OML4R in-database models, making data transformation easier.
- Integrated Text Mining
Integrated text mining in OML allows you to perform text analysis directly within the Oracle Database using SQL and PL/SQL. This integration enables you to extract meaningful insights from unstructured text data without the need to move data outside the database environment.
- About Partitioned Models
Partitioned models allow you to divide your data set into multiple partitions based on specific attributes and build a model for each partition. The system automates the creation and management of these models, reducing manual effort.