SVM-Based Supervised Classification Example

The following example uses SVM-based classification. It follows essentially the same steps as the Decision Tree example, with these differences:

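This example sets the SVM_CLASSIFIER preference with CTX_DDL.CREATE_PREFERENCE rather than in CTX_CLS.TRAIN. (Either approach works.)
Unlike the category table in the Decision Tree example, the category table here includes category descriptions. (Either approach works.)
Because SVM rules are opaque to the user, CTX_CLS.TRAIN takes fewer arguments than in the Decision Tree example.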

To create the SVM-based supervised classification, perform the following steps:

Create and populate the training document table

create table doc (id number primary key, text varchar2(2000));
insert into doc values(1,'1 2 3 4 5 6');
insert into doc values(2,'3 4 7 8 9 0');
insert into doc values(3,'a b c d e f');
insert into doc values(4,'g h i j k l m n o p q r');
insert into doc values(5,'g h i j k s t u v w x y z');

Create and populate the category table

create table testcategory (
  doc_id   number,
  cat_id   number,
  cat_name varchar2(100)
);
insert into testcategory values (1,1,'number');
insert into testcategory values (2,1,'number');
insert into testcategory values (3,2,'letter');
insert into testcategory values (4,2,'letter');
insert into testcategory values (5,2,'letter');

Create a CONTEXT index on the document table

In this case, create the index without populating it.
```
create index docx on doc(text) indextype is ctxsys.context 
       parameters('nopopulate');
```

Set up the SVM_CLASSIFIER

This can also be done in CTX_CLS.TRAIN.

exec ctx_ddl.create_preference('my_classifier','SVM_CLASSIFIER'); 
exec ctx_ddl.set_attribute('my_classifier','MAX_FEATURES','100');
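
To confirm that the preference was stored as intended, the Oracle Text data dictionary can be queried. The following is a minimal sketch, assuming the preference was created in the current schema and that its name is stored in uppercase; it has not been run against a live database.

```sql
-- Sketch: list the attributes explicitly set on the MY_CLASSIFIER preference.
-- Assumes the preference belongs to the current user.
select prv_attribute, prv_value
  from ctx_user_preference_values
 where prv_preference = 'MY_CLASSIFIER';
```

Attributes that were not set explicitly keep their defaults and may not appear in this view.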

Create the result (rule) table

create table restab (
  cat_id number,
  type number(3) not null,
  rule blob
 );

Run the training

exec ctx_cls.train('docx','id','testcategory','doc_id','cat_id','restab','my_classifier');
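
After training, it can be useful to check that rules were actually written to the result table. The query below is a sketch under the assumption that training completed without error; because SVM rules are stored as opaque binary data, it reports only the size of each rule rather than its content.

```sql
-- Sketch: inspect what CTX_CLS.TRAIN wrote to the rule table.
-- The RULE column is a BLOB, so only its length in bytes is shown.
select cat_id, type, dbms_lob.getlength(rule) as rule_bytes
  from restab
 order by cat_id;
```

If no rows come back, training produced no rules and the CTXRULE index in the next step would have nothing to match against.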

Create a CTXRULE index on the rule table

exec ctx_ddl.create_preference('my_filter','NULL_FILTER');
create index restabx on restab (rule) 
       indextype is ctxsys.ctxrule 
       parameters ('filter my_filter classifier my_classifier');

Now two unknown documents can be classified:

select cat_id, match_score(1) from restab 
       where matches(rule, '4 5 6',1)>50;

select cat_id, match_score(1) from restab 
       where matches(rule, 'f h j',1)>50;
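
Because the category table in this example carries category descriptions, the classification result can be reported by name instead of by numeric ID. The following is a sketch, not a verified query: it assumes each cat_id maps to exactly one cat_name in testcategory, and it must run before the cleanup statements that drop the tables.

```sql
-- Sketch: report the matching category by its description.
-- The inline view reduces testcategory to one row per category.
select c.cat_name, match_score(1) as score
  from restab r
  join (select distinct cat_id, cat_name from testcategory) c
    on c.cat_id = r.cat_id
 where matches(r.rule, '4 5 6', 1) > 50;
```

The threshold of 50 mirrors the queries above; lowering it would also list weaker category matches.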

Finally, drop the example tables and preferences:

drop table doc;
drop table testcategory;
drop table restab;
exec ctx_ddl.drop_preference('my_classifier');
exec ctx_ddl.drop_preference('my_filter');