4.2.13 Building a Singular Value Decomposition Model

Beginning in Oracle Database 12c, Release 2 (12.2), the ore.odmSVD function creates a model that uses the Oracle Data Mining Singular Value Decomposition (SVD) algorithm.

Singular Value Decomposition (SVD) is a feature extraction algorithm. SVD performs an orthogonal linear transformation that captures the underlying variance of the data by decomposing a rectangular matrix into three matrices: 'U', 'D', and 'V'. Matrix 'D' is a diagonal matrix whose entries, the singular values, reflect the amount of data variance captured by the bases.
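The decomposition itself can be illustrated without a database connection. The following is a minimal sketch that uses the base R svd function on the numeric columns of the iris data set, not ore.odmSVD. Its numbers are not expected to match the listing for Example 4-23, because the Oracle Data Mining model applies automatic data preparation and also includes the categorical Species attribute. The variable names X, s, U, D, V, X_rebuilt, and share are chosen here only for illustration.

```r
# Numeric columns of the built-in iris data as a plain 150 x 4 matrix
X <- as.matrix(iris[, 1:4])

# Base R SVD (an illustration only, not the Oracle Data Mining algorithm)
s <- svd(X)

U <- s$u        # 150 x 4 matrix with orthonormal columns
D <- diag(s$d)  # 4 x 4 diagonal matrix of singular values
V <- s$v        # 4 x 4 matrix whose columns are the bases

# Multiplying the three matrices back together recovers the original data
X_rebuilt <- U %*% D %*% t(V)
max(abs(X - X_rebuilt))   # difference is at the level of rounding error

# Share of the total sum of squares associated with each singular value
share <- s$d^2 / sum(s$d^2)
round(share, 4)
```

Because this sketch does not center the data, the shares describe the total sum of squares rather than variance about the mean. Centering the columns first (for example with scale(X, scale = FALSE)) would give shares of variance.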

Example 4-23 Using the ore.odmSVD Function

IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))

svd.mod <- ore.odmSVD(~. -Id, IRIS)
summary(svd.mod)
d(svd.mod)
v(svd.mod)
head(predict(svd.mod, IRIS, supplemental.cols = "Id"))

svd.pmod <- ore.odmSVD(~. -Id, IRIS,
                       odm.settings = list(odms_partition_columns = "Species"))
summary(svd.pmod)
d(svd.pmod)
v(svd.pmod)
head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))

Listing for This Example

R> IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))
R> 
R> svd.mod <- ore.odmSVD(~. -Id, IRIS)
R> summary(svd.mod)
Call:
ore.odmSVD(formula = ~. - Id, data = IRIS)

Settings: 
                                               value
odms.missing.value.treatment odms.missing.value.auto
odms.sampling                  odms.sampling.disable
prep.auto                                         ON
scoring.mode                             scoring.svd
u.matrix.output                     u.matrix.disable

d: 
  FEATURE_ID      VALUE
1          1 96.2182677
2          2 19.0780817
3          3  7.2270380
4          4  3.1502152
5          5  1.8849634
6          6  1.1474731
7          7  0.5814097
v: 
  ATTRIBUTE_NAME ATTRIBUTE_VALUE        '1'         '2'          '3'         '4'         '5'         '6'          '7'
1   Petal.Length            <NA> 0.51162932  0.65943465 -0.004420703  0.05479795 -0.51969015  0.17392232 -0.005674672
2    Petal.Width            <NA> 0.16745698  0.32071102  0.146484369  0.46553390  0.72685033  0.31962337 -0.021274748
3   Sepal.Length            <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847  0.31969417 -0.09379235 -0.067308615
4    Sepal.Width            <NA> 0.37906736 -0.50824062  0.142810811  0.69139828 -0.25849391 -0.17606099 -0.041908520
5        Species          setosa 0.03170407 -0.32247642  0.184499940 -0.12245506 -0.14348647  0.76017824  0.497502783
6        Species      versicolor 0.04288799  0.04054823 -0.780684855  0.19827972  0.07363250 -0.12354271  0.571881302
7        Species       virginica 0.05018593  0.16796988  0.551546107 -0.07177990  0.08109974 -0.48442099  0.647048040
Warning message:
In u.ore.odmSVD(object) : U matrix is not calculated.
R> d(svd.mod)
  FEATURE_ID      VALUE
1          1 96.2182677
2          2 19.0780817
3          3  7.2270380
4          4  3.1502152
5          5  1.8849634
6          6  1.1474731
7          7  0.5814097
Warning message:
ORE object has no unique key - using random order 
R> v(svd.mod)
  ATTRIBUTE_NAME ATTRIBUTE_VALUE        '1'         '2'          '3'         '4'         '5'         '6'          '7'
1   Petal.Length            <NA> 0.51162932  0.65943465 -0.004420703  0.05479795 -0.51969015  0.17392232 -0.005674672
2    Petal.Width            <NA> 0.16745698  0.32071102  0.146484369  0.46553390  0.72685033  0.31962337 -0.021274748
3   Sepal.Length            <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847  0.31969417 -0.09379235 -0.067308615
4    Sepal.Width            <NA> 0.37906736 -0.50824062  0.142810811  0.69139828 -0.25849391 -0.17606099 -0.041908520
5        Species          setosa 0.03170407 -0.32247642  0.184499940 -0.12245506 -0.14348647  0.76017824  0.497502783
6        Species      versicolor 0.04288799  0.04054823 -0.780684855  0.19827972  0.07363250 -0.12354271  0.571881302
7        Species       virginica 0.05018593  0.16796988  0.551546107 -0.07177990  0.08109974 -0.48442099  0.647048040
Warning message:
ORE object has no unique key - using random order 
R> head(predict(svd.mod, IRIS, supplemental.cols = "Id"))
  Id        '1'        '2'        '3'         '4'           '5'          '6'          '7' FEATURE_ID
1  1 0.06161595 -0.1291839 0.02586865 -0.01449182  1.536727e-05 -0.023495349 -0.007998605          2
2  2 0.05808905 -0.1130876 0.01881265 -0.09294788  3.466226e-02  0.069569113  0.051195429          2
3  3 0.05678818 -0.1190959 0.02565027 -0.01950986  8.851560e-04  0.040073030  0.060908867          2
4  4 0.05667915 -0.1081308 0.02496402 -0.02233741 -5.750222e-02  0.093904181  0.077741713          2
5  5 0.06123138 -0.1304597 0.02925687  0.02309694 -3.065834e-02 -0.030664898 -0.003629897          2
6  6 0.06747071 -0.1302726 0.03340671  0.06114966 -9.547838e-03 -0.008210224 -0.081807741          2
R> 
R> svd.pmod <- ore.odmSVD(~. -Id, IRIS, 
+                         odm.settings = list(odms_partition_columns = "Species"))
R> summary(svd.pmod)
$setosa

Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))

Settings: 
                                               value
odms.max.partitions                             1000
odms.missing.value.treatment odms.missing.value.auto
odms.partition.columns                     "Species"
odms.sampling                  odms.sampling.disable
prep.auto                                         ON
scoring.mode                             scoring.svd
u.matrix.output                     u.matrix.disable

d: 
  FEATURE_ID      VALUE
1          1 44.2872290
2          2  1.5719162
3          3  1.1458732
4          4  0.6836692
v: 
  ATTRIBUTE_NAME ATTRIBUTE_VALUE       '1'         '2'        '3'         '4'
1   Petal.Length            <NA> 0.2334487  0.46456598  0.8317440 -0.19463332
2    Petal.Width            <NA> 0.0395488  0.04182015  0.1946750  0.97917752
3   Sepal.Length            <NA> 0.8010073  0.40303704 -0.4410167  0.03811461
4    Sepal.Width            <NA> 0.5498408 -0.78739486  0.2753323 -0.04331888

$versicolor

Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))

Settings: 
                                               value
odms.max.partitions                             1000
odms.missing.value.treatment odms.missing.value.auto
...
R> d(svd.pmod)
   PARTITION_NAME FEATURE_ID      VALUE
1          setosa          1 44.2872290
2          setosa          2  1.5719162
3          setosa          3  1.1458732
4          setosa          4  0.6836692
5      versicolor          1 56.2523412
6      versicolor          2  1.9106625
7      versicolor          3  1.7015929
8      versicolor          4  0.6986103
9       virginica          1 66.2734064
10      virginica          2  2.4318639
11      virginica          3  1.6007740
12      virginica          4  1.2958261
Warning message:
ORE object has no unique key - using random order 
R> v(svd.pmod)
   PARTITION_NAME ATTRIBUTE_NAME ATTRIBUTE_VALUE       '1'         '2'         '3'         '4'
1          setosa   Petal.Length            <NA> 0.2334487  0.46456598  0.83174398 -0.19463332
2          setosa    Petal.Width            <NA> 0.0395488  0.04182015  0.19467497  0.97917752
3          setosa   Sepal.Length            <NA> 0.8010073  0.40303704 -0.44101672  0.03811461
4          setosa    Sepal.Width            <NA> 0.5498408 -0.78739486  0.27533228 -0.04331888
5      versicolor   Petal.Length            <NA> 0.5380908  0.49576111 -0.60174021 -0.32029352
6      versicolor    Petal.Width            <NA> 0.1676394  0.36693207 -0.03448373  0.91436795
7      versicolor   Sepal.Length            <NA> 0.7486029 -0.64738491  0.06943054  0.12516311
8      versicolor    Sepal.Width            <NA> 0.3492119  0.44774385  0.79492074 -0.21372297
9       virginica   Petal.Length            <NA> 0.5948985 -0.26368708  0.65157671 -0.38988802
10      virginica    Petal.Width            <NA> 0.2164036  0.59106806  0.42921836  0.64774968
11      virginica   Sepal.Length            <NA> 0.7058813 -0.27846153 -0.53436210  0.37235450
12      virginica    Sepal.Width            <NA> 0.3177999  0.70962445 -0.32507927 -0.53829342
Warning message:
ORE object has no unique key - using random order 
R> head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))
  Id       '1'          '2'          '3'         '4' FEATURE_ID
1  1 0.1432539 -0.026487881 -0.071688339 -0.04956008          1
2  2 0.1334289  0.172689424 -0.114854368 -0.02902893          2
3  3 0.1317675 -0.008327214 -0.062409295 -0.02438248          1
4  4 0.1297716  0.075232572  0.097222019 -0.08055912          1
5  5 0.1426868 -0.102219140 -0.009172782 -0.06147133          1
6  6 0.1554060 -0.055950655  0.160698708  0.14286095          3
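
The FEATURE_ID column in the predict output can be checked against the score columns. The following base R sketch copies the six rows returned by head(predict(svd.pmod, ...)) in the listing and finds, for each row, the column with the largest absolute score. For these rows the result agrees with the FEATURE_ID values shown. This is an observation about the rows in this listing, not a statement of how the algorithm assigns FEATURE_ID in general. The names scores, dominant, and listed are hypothetical and used only for this illustration.

```r
# Score columns '1' to '4' copied from the head(predict(svd.pmod, ...)) listing
scores <- rbind(
  c(0.1432539, -0.026487881, -0.071688339, -0.04956008),
  c(0.1334289,  0.172689424, -0.114854368, -0.02902893),
  c(0.1317675, -0.008327214, -0.062409295, -0.02438248),
  c(0.1297716,  0.075232572,  0.097222019, -0.08055912),
  c(0.1426868, -0.102219140, -0.009172782, -0.06147133),
  c(0.1554060, -0.055950655,  0.160698708,  0.14286095)
)

# Index of the column with the largest absolute score in each row
dominant <- apply(abs(scores), 1, which.max)

# FEATURE_ID values copied from the same listing, for comparison
listed <- c(1, 2, 1, 1, 1, 3)
data.frame(Id = 1:6, dominant = dominant, FEATURE_ID = listed)
```

The absolute value matters: in the listing for svd.mod, the first row has its largest magnitude in column '2', where the score is negative, and the FEATURE_ID shown is 2.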