7.4 Creating a Model that Includes Text Mining
Learn how to create a model that includes text mining.
Oracle Data Mining supports unstructured text within columns of VARCHAR2, CHAR, CLOB, BLOB, and BFILE, as described in the following table:
                  
Table 7-2 Column Data Types That May Contain Unstructured Text
| Data Type | Description | 
|---|---|
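| BFILE and BLOB | Oracle Data Mining interprets BLOB and BFILE as text only if you identify the columns as text when you create the model. If you do not identify the columns as text, then CREATE_MODEL returns an error. |
| CLOB | Oracle Data Mining interprets CLOB as text. |
| CHAR | Oracle Data Mining interprets CHAR as categorical by default. You can identify columns of CHAR as text when you create the model. |
| VARCHAR2 | Oracle Data Mining interprets VARCHAR2 with data length > 4000 as text. Oracle Data Mining interprets VARCHAR2 with data length <= 4000 as categorical by default. You can identify these columns as text when you create the model. |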
The settings described in the following table control the term extraction process for text attributes in a model. Instructions for specifying model settings are in "Specifying Model Settings".
Table 7-3 Model Settings for Text
| Setting Name | Data Type | Setting Value | Description | 
|---|---|---|---|
| ODMS_TEXT_POLICY_NAME | VARCHAR2(4000) | Name of an Oracle Text policy object created with CTX_DDL.CREATE_POLICY | Affects how individual tokens are extracted from unstructured text. See "Creating a Text Policy". |
| ODMS_TEXT_MAX_FEATURES | INTEGER | 1 <= value <= 100000 | Maximum number of features to use from the document set (across all documents of each text column) passed to CREATE_MODEL. Default is 3000. |
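
As a minimal sketch of how these two settings might be supplied, the following creates a settings table and stores the text settings ODMS_TEXT_POLICY_NAME and ODMS_TEXT_MAX_FEATURES in it. The table name `text_model_settings` and the policy name `MY_TEXT_POLICY` are hypothetical names chosen for illustration; the policy is assumed to have been created beforehand with CTX_DDL.CREATE_POLICY.

```sql
-- Settings table in the two-column layout read by DBMS_DATA_MINING.CREATE_MODEL.
-- The table name is a hypothetical example.
CREATE TABLE text_model_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  -- Name of an existing Oracle Text policy (assumed name, created separately).
  INSERT INTO text_model_settings (setting_name, setting_value)
  VALUES ('ODMS_TEXT_POLICY_NAME', 'MY_TEXT_POLICY');

  -- Cap on the number of text features; 3000 is the documented default,
  -- and the value must lie between 1 and 100000.
  INSERT INTO text_model_settings (setting_name, setting_value)
  VALUES ('ODMS_TEXT_MAX_FEATURES', '3000');

  COMMIT;
END;
/
```

Setting values are stored as strings, so the numeric limit is written as `'3000'`. Any algorithm settings for the model would go into the same table.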
A model can include one or more text attributes. A model with text attributes can also include categorical and numerical attributes.
To create a model that includes text attributes:
- Create an Oracle Text policy object.
- Specify the model configuration settings that are described in "Table 7-3".
- Specify which columns must be treated as text and, optionally, provide text transformation instructions for individual attributes.
- Pass the model settings and text transformation instructions to DBMS_DATA_MINING.CREATE_MODEL.

Note: All algorithms except O-Cluster can support columns of unstructured text. The use of unstructured text is not recommended for association rules (Apriori).
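
The steps above can be sketched in PL/SQL as follows. This is an illustrative outline, not a definitive implementation: the policy name `MY_TEXT_POLICY`, the build table `mining_build_text` with columns `cust_id`, `affinity_card`, and `comments`, the model name `TEXT_CLAS_MODEL`, and the settings table `text_model_settings` are all assumed names. The settings table is assumed to already exist and to hold the text settings from Table 7-3, and the user is assumed to have EXECUTE privilege on CTX_DDL.

```sql
-- Step 1: create an Oracle Text policy object (hypothetical policy name).
BEGIN
  CTX_DDL.CREATE_POLICY(policy_name => 'MY_TEXT_POLICY');
END;
/

DECLARE
  xforms DBMS_DATA_MINING_TRANSFORM.TRANSFORM_LIST;
BEGIN
  -- Step 3: mark the COMMENTS column as text through its attribute
  -- specification. The expression passes the column through unchanged.
  DBMS_DATA_MINING_TRANSFORM.SET_TRANSFORM(
    xform_list         => xforms,
    attribute_name     => 'comments',
    attribute_subname  => NULL,
    expression         => 'comments',
    reverse_expression => NULL,
    attribute_spec     => 'TEXT(TOKEN_TYPE:NORMAL)');

  -- Step 4: pass the settings table (step 2, assumed to be populated
  -- already) and the transformation list to CREATE_MODEL.
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'TEXT_CLAS_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'mining_build_text',
    case_id_column_name => 'cust_id',
    target_column_name  => 'affinity_card',
    settings_table_name => 'text_model_settings',
    xform_list          => xforms);
END;
/
```

Because no algorithm setting is assumed here, the model would be built with the default algorithm for classification. The other columns of the build table can still be categorical or numerical, since only `comments` is declared as text.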