3 Big Data and Data Warehousing Solutions
Analytic Views
Analytic View Enhancements to SQL and PL/SQL
SQL and PL/SQL enhancements for analytic views are the following:
- Star views can be created on fact and dimension tables
- Analytic views are created automatically for queries of the tables and views
- Level group caches are created and maintained autonomously
- Query-scoped base measures
- Support for remote sources
- Support for aggregate tables
- Star support for dimension caches
With these enhancements:
- Star view and star table queries can use the benefits of analytic views without any code changes
- Front-end tools can use analytic view calculations by inserting a new expression
- Level group caches can become more efficient over time based on usage patterns
Related Topics
Parent topic: Analytic Views
Analytical SQL and Statistical Functions
- Bitwise Aggregate Functions
- Enhanced Analytic Functions
- New Analytical and Statistical Aggregate Functions
Parent topic: Big Data and Data Warehousing Solutions
Bitwise Aggregate Functions
New aggregate functions BIT_AND_AGG
,
BIT_OR_AGG
, and BIT_XOR_AGG
enable
bitwise aggregation of integer columns and columns that can be
converted or rounded to integer values.
Bitwise aggregation functions enable bitwise type processing directly in SQL. Use of these new functions improves overall query performance by eliminating unnecessary data movement and by taking full advantage of other database capabilities such as parallel processing.
Related Topics
- Oracle® Database SQL Language Reference
- Use Bitwise Aggregate Functions lab in the Database 21c New Features workshop in LiveLabs
Parent topic: Analytical SQL and Statistical Functions
Enhanced Analytic Functions
Window functions now support the EXCLUDE
options of
the SQL standard window frame clause. The query_block clause of a
SELECT
statement now supports the
window_clause
, which implements the window clause of
the SQL standard table expression as defined in the SQL:2011
standard.
Supporting the full ANSI standard enables easier migration of applications that were developed against other standard-compliant database systems.
Related Topics
- Oracle® Database Data Warehousing Guide
- Use Enhanced Analytic Functions lab in the Database 21c New Features workshop in LiveLabs
Parent topic: Analytical SQL and Statistical Functions
New Analytical and Statistical Aggregate Functions
New analytical and statistical aggregate functions are available in SQL:
CHECKSUM
computes the checksum of the input values
or expression.
KURTOSIS
functions KURTOSIS_POP
and
KURTOSIS_SAMP
measure the tailedness of a data set
where a higher value means more of the variance within the data set
is the result of infrequent extreme deviations as opposed to
frequent modestly sized deviations. Note that a normal distribution
has a kurtosis of zero.
SKEWNESS
functions SKEWNESS_POP
and
SKEWNESS_SAMP
are measures of asymmetry in data. A
positive skewness is means the data skews to the right of the
center point. A negative skewness means the data skews to the
left.
All of these new aggregate functions support the keywords
ALL
, DISTINCT
, and
UNIQUE
.
ANY_VALUE
, a function to simplify and optimize the performance of
GROUP BY
statements, returns a random value in a group and is optimized to
return the first value in the group. It ensures that there are no comparisons for any incoming
row and eliminates the necessity to specify every column as part of the GROUP
BY
clause.
With these additional SQL aggregation functions, you can write more efficient code and benefit from faster in-database processing.
Related Topics
- Oracle® Database SQL Language Reference
- Detect Data Tampering with the CHECKSUM Function, Measure Asymmetry in Data with the SKEWNESS Functions, and Measure Tailedness of Data with the KURTOSIS Functions labs in the Database 21c New Features workshop in LiveLabs
Parent topic: Analytical SQL and Statistical Functions
Machine Learning for SQL
- Adam Optimization Solver for the Neural Network Algorithm
- Oracle Machine Learning MSET-SPRT Algorithm
- Oracle Machine Learning XGBoost Algorithm
Parent topic: Big Data and Data Warehousing Solutions
Adam Optimization Solver for the Neural Network Algorithm
Adam is an optimization solver for the Neural Network algorithm that is computationally efficient, requires little memory, and is well suited for problems that are large in terms of data or parameters or both.
Adam is a popular extension to stochastic gradient descent. It uses mini-batch optimization and can make progress faster while seeing less data than the other Neural Network optimization solver, Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) with line search.
Related Topics
Parent topic: Machine Learning for SQL
Oracle Machine Learning MSET-SPRT Algorithm
The Multivariate State Estimation Technique-Sequential
Probability Ratio Test (MSET-SPRT
) algorithm is a
nonlinear, nonparametric anomaly detection technique for monitoring
critical processes.
The DBMS_DATA_MINING.ALGO_MSET_SPRT
algorithm
detects subtle anomalies while producing minimal false alarms. It
calibrates expected behavior from available, historical data of the
normal operational sequence of monitored signals. It incorporates
the learned behavior of the system into a persistent
MSET-SPRT
model. You can apply the model to new
records to detect anomalous behavior.
Related Topics
Parent topic: Machine Learning for SQL
Oracle Machine Learning XGBoost Algorithm
XGBoost is a highly-efficient, scalable gradient tree boosting machine learning algorithm for regression and classification.
The DBMS_DATA_MINING.ALGO_XBGOOST
algorithm
prepares training data, builds and persists a model, and applies
the model for prediction. You can use it as a stand-alone predictor
or incorporate it into real-world production pipelines for a wide
range of problems such as ad click-through rate prediction, hazard
risk prediction, web text classification, and so on.
Related Topics
Parent topic: Machine Learning for SQL
Machine Learning for Python
- Oracle Machine Learning for Python (OML4Py)
- Oracle Machine Learning for Python Configuration in DBCA
Parent topic: Big Data and Data Warehousing Solutions
Oracle Machine Learning for Python (OML4Py)
Oracle Machine Learning for Python (OML4Py) enables the open source Python programming language and environment to operate on database data at scale. Python users can run Python commands and scripts for statistical analysis and machine learning on data stored in Oracle Database.
With OML4Py, you can do the following:
- Use a wide range of in-database machines learning algorithms
- Minimize data movement
- Leverage Oracle Database as a high performance compute engine for data exploration and preparation
- Use AutoML for automatic algorithm selection, feature selection, and model tuning
- Execute user-defined Python functions in non-parallel, data-parallel, and task-parallel fashion
Related Topics
Parent topic: Machine Learning for Python
Oracle Machine Learning for Python Configuration in DBCA
The Database Configuration Assistant (DBCA) supports configuring Oracle Machine Learning for Python (OML4Py).
Configuration through DBCA is supported for container databases and pluggable databases.
Related Topics
Parent topic: Machine Learning for Python
Query Optimization
In-Memory Vectorized Joins
The In-Memory vectorized joins feature is based on the deep vectorization framework. Using SIMD vector processing, the framework optimizes aspects of hash joins such as hashing, building, probing, and gathering. This optimization can improve the performance of join processing by 100% or more. The In-Memory vectorized joins feature is transparent to the user, requiring no plan changes.
In-Memory deep vectorization is a SIMD-based query processing framework that supports vectorization for higher-level query operators in the query plan. The framework includes optimizations such as SIMD, hardware acceleration, and pipelined execution.
Joins can account for a large percentage of SQL execution time for data warehouse workloads. Improving performance of hash joins by 100% or more can significantly improve performance.
Related Topics
Parent topic: Query Optimization
Spatial and Graph
- Property Graph: Graph Server and Client Kit
- Property Graph: Native Python Client
- Property Graph: New Features in PGQL
- Property Graph: Optimized Graph Representation for Faster Performance
- Property Graph: User-defined Graph Algorithms
- Property Graph Visualization
- RDF Graph: Native Unicode Storage and Processing
- Self-service Low-code Spatial Studio
- Spatial Network Data Model Contraction Hierarchy
- Spatial Support for Database In-Memory
Parent topic: Big Data and Data Warehousing Solutions
Property Graph: Graph Server and Client Kit
You no longer have to copy the Property Graph in-memory
analytics server (PGX) and client tools and libraries from
$ORACLE_HOME
.
With the simplified packaging and the availability of the Oracle Graph Server and Client kit on OCI Marketplace, application developers can quickly and securely install and deploy the components required to work with Property Graphs. This makes it easier to start developing applications.
Related Topics
Parent topic: Spatial and Graph
Property Graph: Native Python Client
Property Graph has a native Python API that allows you to create a graph, run graph queries (using PGQL), and analyze the graph using Python and Jupyter notebooks.
This makes it easier to get started with graph and simplifies integration in data science applications where Python and Jupyter are especially popular.
Related Topics
Parent topic: Spatial and Graph
Property Graph: New Features in PGQL
You can now do graph DDL and graph DML operations with the graph query language PGQL. Additionally, you can use PGQL for CHEAPEST path queries using cost functions.
This simplifies development by eliminating the need for configuration files and Java APIs when doing several common graph operations.
Related Topics
Parent topic: Spatial and Graph
Property Graph: Optimized Graph Representation for Faster Performance
The In-memory analytics server (PGX, or Property Graph AnalytiX) has an optimized representation of a property graph that uses less memory. Larger graphs can be analyzed in the same amount of memory.
Graph sizes are continuously growing larger. With this optimization you can analyze larger graphs using less memory than previously required. This not only enables analysis of more data, but also reduces system costs. The optimized graph representation gives you faster performance and is transparent; existing applications will run faster with no change.
Related Topics
Parent topic: Spatial and Graph
Property Graph: User-defined Graph Algorithms
You can create or extend graph algorithms using Java syntax, in addition to the dozens of pre-built graph analytics that come with the product. These user-defined algorithms will execute as fast as native algorithms in the product because they are compiled with the same optimizations.
For unique and specialized use cases, customizing graph algorithms lets you add analysis that analysts and data scientists design specifically for your applications.
Related Topics
Parent topic: Spatial and Graph
Property Graph Visualization
A rich set of visualization features lets you interactively explore the graph, customize layouts, and highlight interesting relationships in your data.
Seeing graph data and relationships visually lets analysts, data scientists, and developers quickly understand and explore clusters, outliers, anomalies, patterns, communities, and critical connections in their data. This makes you more productive and helps you share and communicate results more clearly.
Related Topics
Parent topic: Spatial and Graph
RDF Graph: Native Unicode Storage and Processing
When creating the RDF (Resource Description Framework) network, you can now store RDF data in a native unicode format for virtually all use cases.
This reduces the storage required and enhances query performance.
Related Topics
Parent topic: Spatial and Graph
Self-service Low-code Spatial Studio
Oracle Spatial Studio is a self-service web application that makes it easy for you to create interactive maps and perform spatial analysis on your business data. You can also use Spatial Studio to publish spatial analyses as REST services and generate SQL statements for spatial analysis using low-code and UI components.
With Spatial Studio, you no longer need to write Javascript or SQL or use third-party tools to take advantage of the spatial capabilities in Oracle Database.
Related Topics
Parent topic: Spatial and Graph
Spatial Network Data Model Contraction Hierarchy
Contraction hierarchy, a precomputed in-memory approach, speeds up path computations in the Oracle Spatial and Graph network data model.
By using the contraction hierarchy Java API, you can evaluate shortest path computations, drive time polygon analysis, and traveling salesperson analysis functions more efficiently. These functions will perform 10 to 100 times faster than with previous releases. You can run more network analysis functions using fewer CPUs, and support more concurrent requests using the same hardware.
Related Topics
Parent topic: Spatial and Graph
Spatial Support for Database In-Memory
A new In-Memory Spatial index is created when the
INMEMORY
SPATIAL
clause is specified for
a spatial geometry column during a CREATE
or
ALTER
TABLE
statement.
This means that you no longer have to create and maintain conventional disk-based spatial indexes when spatial tables are stored using Database In-Memory. The in-memory spatial index provides much faster query performance for SDO_FILTER() operations without having to maintain a separate, on-disk spatial index.
Related Topics
Parent topic: Spatial and Graph