Starting with Oracle Database 12c Release 2 (12.2), the ore.odmSVD function creates a model that uses the Oracle Data Mining Singular Value Decomposition (SVD) algorithm.
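Singular Value Decomposition (SVD) is a feature extraction algorithm. SVD is an orthogonal linear transformation that captures the underlying variance of the data by decomposing a rectangular matrix into three matrices: U, D, and V. Matrix D is a diagonal matrix, and its singular values reflect the amount of data variance captured by the bases.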
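In standard notation, the decomposition described above can be written as follows. The symbols X (the input data matrix), r (the number of retained features), and the singular values sigma_k are notation introduced here for illustration; they do not appear in the original text.

```latex
X = U D V^{\mathsf{T}}, \qquad
U^{\mathsf{T}} U = I, \quad
V^{\mathsf{T}} V = I, \quad
D = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r \ge 0

% Share of the total sum of squares captured by the k-th basis
% (equal to the share of variance when the columns of X are centered):
\frac{\sigma_k^{2}}{\sum_{j=1}^{r} \sigma_j^{2}}
```

Read this way, the columns of V are the bases, and a large gap between the first singular value and the rest suggests that most of the structure in the data lies along the first basis.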
Example 4-23 Using the ore.odmSVD Function
# Push iris to the database with an added Id column.
IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))

# Build an SVD model on every column except Id.
svd.mod <- ore.odmSVD(~. -Id, IRIS)
summary(svd.mod)
d(svd.mod)
v(svd.mod)
head(predict(svd.mod, IRIS, supplemental.cols = "Id"))

# Build a partitioned model, with one SVD per value of Species.
svd.pmod <- ore.odmSVD(~. -Id, IRIS,
                       odm.settings = list(odms_partition_columns = "Species"))
summary(svd.pmod)
d(svd.pmod)
v(svd.pmod)
head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))
Listing for This Example
R> IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))
R>
R> svd.mod <- ore.odmSVD(~. -Id, IRIS)
R> summary(svd.mod)

Call:
ore.odmSVD(formula = ~. - Id, data = IRIS)

Settings:
                                               value
odms.missing.value.treatment odms.missing.value.auto
odms.sampling                  odms.sampling.disable
prep.auto                                         ON
scoring.mode                             scoring.svd
u.matrix.output                     u.matrix.disable

d:
  FEATURE_ID      VALUE
1          1 96.2182677
2          2 19.0780817
3          3  7.2270380
4          4  3.1502152
5          5  1.8849634
6          6  1.1474731
7          7  0.5814097

v:
  ATTRIBUTE_NAME ATTRIBUTE_VALUE        '1'         '2'          '3'         '4'         '5'         '6'          '7'
1   Petal.Length            <NA> 0.51162932  0.65943465 -0.004420703  0.05479795 -0.51969015  0.17392232 -0.005674672
2    Petal.Width            <NA> 0.16745698  0.32071102  0.146484369  0.46553390  0.72685033  0.31962337 -0.021274748
3   Sepal.Length            <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847  0.31969417 -0.09379235 -0.067308615
4    Sepal.Width            <NA> 0.37906736 -0.50824062  0.142810811  0.69139828 -0.25849391 -0.17606099 -0.041908520
5        Species          setosa 0.03170407 -0.32247642  0.184499940 -0.12245506 -0.14348647  0.76017824  0.497502783
6        Species      versicolor 0.04288799  0.04054823 -0.780684855  0.19827972  0.07363250 -0.12354271  0.571881302
7        Species       virginica 0.05018593  0.16796988  0.551546107 -0.07177990  0.08109974 -0.48442099  0.647048040
Warning message:
In u.ore.odmSVD(object) : U matrix is not calculated.
R> d(svd.mod)
  FEATURE_ID      VALUE
1          1 96.2182677
2          2 19.0780817
3          3  7.2270380
4          4  3.1502152
5          5  1.8849634
6          6  1.1474731
7          7  0.5814097
Warning message:
ORE object has no unique key - using random order
R> v(svd.mod)
  ATTRIBUTE_NAME ATTRIBUTE_VALUE        '1'         '2'          '3'         '4'         '5'         '6'          '7'
1   Petal.Length            <NA> 0.51162932  0.65943465 -0.004420703  0.05479795 -0.51969015  0.17392232 -0.005674672
2    Petal.Width            <NA> 0.16745698  0.32071102  0.146484369  0.46553390  0.72685033  0.31962337 -0.021274748
3   Sepal.Length            <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847  0.31969417 -0.09379235 -0.067308615
4    Sepal.Width            <NA> 0.37906736 -0.50824062  0.142810811  0.69139828 -0.25849391 -0.17606099 -0.041908520
5        Species          setosa 0.03170407 -0.32247642  0.184499940 -0.12245506 -0.14348647  0.76017824  0.497502783
6        Species      versicolor 0.04288799  0.04054823 -0.780684855  0.19827972  0.07363250 -0.12354271  0.571881302
7        Species       virginica 0.05018593  0.16796988  0.551546107 -0.07177990  0.08109974 -0.48442099  0.647048040
Warning message:
ORE object has no unique key - using random order
R> head(predict(svd.mod, IRIS, supplemental.cols = "Id"))
  Id        '1'        '2'        '3'         '4'           '5'          '6'          '7' FEATURE_ID
1  1 0.06161595 -0.1291839 0.02586865 -0.01449182  1.536727e-05 -0.023495349 -0.007998605          2
2  2 0.05808905 -0.1130876 0.01881265 -0.09294788  3.466226e-02  0.069569113  0.051195429          2
3  3 0.05678818 -0.1190959 0.02565027 -0.01950986  8.851560e-04  0.040073030  0.060908867          2
4  4 0.05667915 -0.1081308 0.02496402 -0.02233741 -5.750222e-02  0.093904181  0.077741713          2
5  5 0.06123138 -0.1304597 0.02925687  0.02309694 -3.065834e-02 -0.030664898 -0.003629897          2
6  6 0.06747071 -0.1302726 0.03340671  0.06114966 -9.547838e-03 -0.008210224 -0.081807741          2
R>
R> svd.pmod <- ore.odmSVD(~. -Id, IRIS,
+                         odm.settings = list(odms_partition_columns = "Species"))
R> summary(svd.pmod)
$setosa

Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))

Settings:
                                               value
odms.max.partitions                             1000
odms.missing.value.treatment odms.missing.value.auto
odms.partition.columns                     "Species"
odms.sampling                  odms.sampling.disable
prep.auto                                         ON
scoring.mode                             scoring.svd
u.matrix.output                     u.matrix.disable

d:
  FEATURE_ID      VALUE
1          1 44.2872290
2          2  1.5719162
3          3  1.1458732
4          4  0.6836692

v:
  ATTRIBUTE_NAME ATTRIBUTE_VALUE       '1'         '2'        '3'         '4'
1   Petal.Length            <NA> 0.2334487  0.46456598  0.8317440 -0.19463332
2    Petal.Width            <NA> 0.0395488  0.04182015  0.1946750  0.97917752
3   Sepal.Length            <NA> 0.8010073  0.40303704 -0.4410167  0.03811461
4    Sepal.Width            <NA> 0.5498408 -0.78739486  0.2753323 -0.04331888

$versicolor

Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))

Settings:
                                               value
odms.max.partitions                             1000
odms.missing.value.treatment odms.missing.value.auto

R> # xyz
R> d(svd.pmod)
   PARTITION_NAME FEATURE_ID      VALUE
1          setosa          1 44.2872290
2          setosa          2  1.5719162
3          setosa          3  1.1458732
4          setosa          4  0.6836692
5      versicolor          1 56.2523412
6      versicolor          2  1.9106625
7      versicolor          3  1.7015929
8      versicolor          4  0.6986103
9       virginica          1 66.2734064
10      virginica          2  2.4318639
11      virginica          3  1.6007740
12      virginica          4  1.2958261
Warning message:
ORE object has no unique key - using random order
R> v(svd.pmod)
   PARTITION_NAME ATTRIBUTE_NAME ATTRIBUTE_VALUE       '1'         '2'         '3'         '4'
1          setosa   Petal.Length            <NA> 0.2334487  0.46456598  0.83174398 -0.19463332
2          setosa    Petal.Width            <NA> 0.0395488  0.04182015  0.19467497  0.97917752
3          setosa   Sepal.Length            <NA> 0.8010073  0.40303704 -0.44101672  0.03811461
4          setosa    Sepal.Width            <NA> 0.5498408 -0.78739486  0.27533228 -0.04331888
5      versicolor   Petal.Length            <NA> 0.5380908  0.49576111 -0.60174021 -0.32029352
6      versicolor    Petal.Width            <NA> 0.1676394  0.36693207 -0.03448373  0.91436795
7      versicolor   Sepal.Length            <NA> 0.7486029 -0.64738491  0.06943054  0.12516311
8      versicolor    Sepal.Width            <NA> 0.3492119  0.44774385  0.79492074 -0.21372297
9       virginica   Petal.Length            <NA> 0.5948985 -0.26368708  0.65157671 -0.38988802
10      virginica    Petal.Width            <NA> 0.2164036  0.59106806  0.42921836  0.64774968
11      virginica   Sepal.Length            <NA> 0.7058813 -0.27846153 -0.53436210  0.37235450
12      virginica    Sepal.Width            <NA> 0.3177999  0.70962445 -0.32507927 -0.53829342
Warning message:
ORE object has no unique key - using random order
R> head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))
  Id       '1'          '2'          '3'         '4' FEATURE_ID
1  1 0.1432539 -0.026487881 -0.071688339 -0.04956008          1
2  2 0.1334289  0.172689424 -0.114854368 -0.02902893          2
3  3 0.1317675 -0.008327214 -0.062409295 -0.02438248          1
4  4 0.1297716  0.075232572  0.097222019 -0.08055912          1
5  5 0.1426868 -0.102219140 -0.009172782 -0.06147133          1
6  6 0.1554060 -0.055950655  0.160698708  0.14286095          3
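To make the roles of the d and v components in the listing more concrete, the following is a minimal local sketch that uses base R's svd() on the four numeric iris columns. It is an illustration only and does not call ore.odmSVD or run in the database. Its numbers are not expected to match the listing, because the in-database model also applies automatic data preparation (prep.auto is ON) and includes Species as an attribute. The variable names X, s, share, proj, and X.hat are hypothetical.

```r
# Local, in-memory illustration with base R; this is NOT the in-database ore.odmSVD call.
X <- as.matrix(iris[, 1:4])          # numeric columns only

s <- svd(X)                          # X = U %*% diag(d) %*% t(V)

# Singular values in decreasing order (the role played by d(svd.mod)).
s$d

# Right singular vectors, one column per feature (the role played by v(svd.mod)).
rownames(s$v) <- colnames(X)
s$v

# Share of the total sum of squares captured by each feature.
share <- s$d^2 / sum(s$d^2)
round(share, 4)

# Projecting the rows onto the bases: X %*% V equals U %*% diag(d).
proj <- X %*% s$v
head(proj)

# Sanity check: the three factors reproduce X up to floating-point error.
X.hat <- s$u %*% diag(s$d) %*% t(s$v)
stopifnot(isTRUE(all.equal(X.hat, X, check.attributes = FALSE)))
```

The same pattern of one dominant singular value followed by much smaller ones is visible in the d output of the listing, both for the single model and within each Species partition.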