The Oracle Data Miner graphical user interface (GUI) is based on the GUI for SQL Developer 4.0.
In SQL Developer, click Help and then click SQL Developer Concepts and Usage for an overview of the SQL Developer GUI. The following topics describe the common procedures and menus for the GUI that are specific to data mining and the Oracle Data Miner:
The Oracle Data Miner window generally uses the left side for navigation to find and select objects and the right side to display information about selected objects.
Note:
This text explains the default interface. However, you can customize many aspects of the appearance and behavior of Oracle Data Miner by setting the Oracle Data Miner preferences.
Here is a simple workflow:
Data Miner tab: Positioned in the left pane. You manage database connections here.
Workflow Jobs: Positioned in the left pane. You can view the status for running tasks.
Components: Positioned in the right pane. You select the nodes of the workflow here.
Menu bar: Positioned at the top of the window.
Workflows are displayed in the middle pane of the window.
Properties: You can put this pane any place you want. It is useful to position the Properties pane just below the workflow that you are viewing.
Some settings are sticky settings. If you change a sticky setting, then the new value becomes the default value.
The menus at the top of the Oracle SQL Developer window contain standard entries, and entries for features specific to Oracle Data Miner.
You can use keyboard shortcuts to access menus and menu items. For example, press Alt+F for the File menu and Alt+E for the Edit menu, or press Alt+H and then Alt+S for Full Text Search of the Help topics. You can also display the File menu by pressing the F10 key.
The icons just below the menus perform a variety of actions, including the following:
New: Opens the New Gallery to define new database objects, such as a new database connection.
Open: Opens a file.
Save: Saves any changes to the currently selected object.
Save All: Saves any changes to all open objects.
Undo: Reverses the last operation.
There are several forms of Undo, including Undo Create to undo the most recent node creation, Undo Edit Node to undo the most recent node edit, and so on.
Redo: Reapplies the most recently undone operation.
There are several forms of Redo, including Redo Create to redo the most recent node creation, Redo Edit Node to redo the most recent node edit, and so on.
Back: Moves to the pane that you most recently visited. Use the drop-down arrow to specify the tab view.
Forward: Moves to the pane after the current one in the list of visited panes. Or use the drop-down arrow to specify a tab view.
These menus contain functionality specific to Oracle Data Miner:
From the View menu, you can access the options related to Data Miner tab, workflow jobs, scheduled workflow jobs, and workflow structure.
The following options are available in the View menu of Oracle Data Miner:
Click Data Miner Connections to open the Data Miner tab.
Click Workflow Jobs to dock the Workflow Jobs pane.
Click Scheduled Jobs to dock the Scheduled Jobs pane.
Click Structure to view the workflow structure in the Workflow Structure pane.
Workflow Jobs displays all running and recently run tasks, arranged according to connection.
A task consists of running a selected node and any ancestor nodes that must be run before that node.
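The idea of a task can be sketched in a few lines of Python. This is only an illustration, not how Oracle Data Miner schedules work internally: the function name `task_run_order`, the dictionary layout, and the node names are hypothetical, the workflow is assumed to be acyclic, and every ancestor is assumed to need a run.

```python
from typing import Dict, List


def task_run_order(parents: Dict[str, List[str]], selected: str) -> List[str]:
    """Return the selected node preceded by every ancestor it depends on."""
    order: List[str] = []
    visited = set()

    def visit(node: str) -> None:
        if node in visited:
            return
        visited.add(node)
        # Run every parent (and, recursively, its parents) before this node.
        for parent in parents.get(node, []):
            visit(parent)
        order.append(node)

    visit(selected)
    return order


# Hypothetical workflow: each key lists the nodes that feed into it.
workflow = {
    "Class Build": ["Data Source"],
    "Apply": ["Class Build", "Data Source"],
    "Explore": ["Data Source"],  # not an ancestor of "Apply", so it is skipped
}
print(task_run_order(workflow, "Apply"))
```

Running the Apply node in this sketch also runs Data Source and Class Build, but not Explore, because Explore is not an ancestor of the selected node.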
The following conditions apply for Workflow Job display:
Workflow Jobs displays the most recent run of a workflow.
When two or more tasks are active, the Workflow Jobs window is automatically displayed.
By default, Workflow Jobs automatically displays the connection selected in the tab. To view tasks in a different connection, select a connection from the list at the top of the tab.
The Workflow Jobs preferences specify how long a task is displayed.
You can perform multiple tasks from the Workflow Jobs context menu. Right-click a line in the grid or below the lines in the grid.
In the Scheduled Jobs window, you can view and manage scheduled workflow jobs. You can view details such as the workflow name, project name, and the date and time of the next scheduled run.
Cancel schedule
Edit schedule
Open the workflow
View Event Log
The Structure window shows the structure of a workflow or certain model viewers that are currently active.
The nodes in a tree or workflow are listed in a flat list, which does not show parent or child relationships. The links are the keys that tie the nodes together.
When you view nodes and links in the Structure window, the workflow editor reacts by immediately making the selected items visible. This property is useful when you are navigating a complex workflow or tree.
Consider a workflow named simple, consisting of a Data Source node and a Classification Build node.
The Structure window shows a view of the workflow simple.
If you select an item in the Structure window, then the corresponding item is selected in the workflow. For example, if you go to the Links folder and select From "DATA_MINING_BUILD_V" to "Class Build", then the link between the two nodes in simple is highlighted.
These model viewers include a Tree tab:
k-Means and O-Cluster model viewer for the Clustering models
Decision Tree model viewer for the DT Classification model
The Tree tab of the model viewer illustrates how the rules generated by the model are related. These trees are sometimes large and complex. You can use the Structure window to navigate among the nodes of the tree, just as you navigate among the nodes of a workflow. In the Structure window, select a node or an item in the Links folder. The node or link in the model viewer is automatically highlighted.
The Structure window supports these controls:
To freeze the Structure window on the current view, click the freeze icon.
A window that is frozen does not track the active selection in the active window.
To open a new instance of the Structure window, click the new view icon.
The new view appears as a new tab in the Structure window.
From the Tools menu, you can access the options related to Data Miner and Data Miner preferences.
The following options are available in the Tools menu of Oracle Data Miner:
Using the Data Miner menu option, you can open the Data Miner tab, access the Scheduled Jobs option, and drop the repository.
The following options are available in the Data Miner option under Tools menu:
Make Visible: Opens the Data Miner tab and Workflow Jobs.
Scheduled Jobs: Opens the Scheduled Jobs dialog box that displays the list of scheduled workflow jobs.
Drop Repository: Drops the Data Miner repository.
These actions are usually performed as part of the Oracle Data Miner installation.
You can set preferences for Oracle Data Miner in the Preferences option in the Tools menu.
To set preferences, click Tools and then Preferences. Under Preferences, click Data Miner. Data Miner preferences are divided into several sets:
Node settings specify the behavior of workflow nodes.
You can specify settings related to model build, performance options, and transforms.
Model settings specify properties related to model build, model details, Apply, Text nodes, Test nodes and so on.
The Apply preferences specify how an Apply node operates.
Automatic Apply Settings Default is either Automatic or Manual. By default, it is set to Automatic.
Automatic Data Settings Default is either Automatic (the default) or Manual.
Default Column Order is either Data Columns First or Apply Columns First. Data Columns First is the default.
The default values for model build depend on the mining function.
Specify default values for model build options:
By default, an Association node builds one model using the Apriori algorithm.
The default maximum distinct count for item values is 10. You can change the default to a different integer.
A Classification node automatically generates four models by default.
A Classification node uses any of these algorithms to generate the models:
Decision Tree
General Linear Model
Naive Bayes
Support Vector Machine
All four models have the same input data, the same target, and the same case ID (if a case ID is specified).
If you do not want to build models using one of the default algorithms, then deselect that algorithm. You can still add models using the deselected algorithm to a Classification node.
By default, the node generates these test results for tuning:
Performance Metrics
Performance Matrix (Confusion Matrix)
ROC Curve (Binary only)
Lift and Profit. The default is set to the top 5 target values by frequency. You can edit the default setting. By default, the node does not generate selected metrics for Model tuning. You can select the metrics for Model tuning.
You can deselect any of the test results. For example, if you deselect Performance Matrix, then a Performance Matrix is not generated by default.
By default, split data is used for test data. Forty percent of the data is used for testing, and the split data is created as a table. You can change the percentage used for testing and you can create the split data as a view instead of a table. If you create a table, then you can create it in parallel. You can use all of the build data for testing, or you can use a separate test source.
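The default 40 percent test split can be pictured with a small Python sketch. Oracle Data Miner performs the split in the database, so this is only an illustration of the proportion: the function name `split_build_test`, the shuffle-based method, and the `seed` parameter are assumptions and are not taken from the product.

```python
import random
from typing import List, Sequence, Tuple


def split_build_test(
    rows: Sequence, test_percent: float = 40.0, seed: int = 0
) -> Tuple[List, List]:
    """Split rows into (build, test), holding out test_percent for testing."""
    rng = random.Random(seed)  # hypothetical seed, used only for repeatability
    shuffled = list(rows)
    rng.shuffle(shuffled)
    test_count = round(len(shuffled) * test_percent / 100.0)
    # The first test_count shuffled rows are held out; the rest build the model.
    return shuffled[test_count:], shuffled[:test_count]


build, test = split_build_test(range(10))
print(len(build), len(test))  # 6 4
```

Passing `test_percent=100.0` in this sketch leaves no build rows, which is different from the product option of using all of the build data for testing.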
By default, a Clustering node builds two models, one each using O-Cluster and k-Means.
If you are connected to Oracle Database 12c Release 1 (12.1) or later, then a Clustering node also builds an Expectation Maximization model, so that three models are built.
All Clustering models in the node have the same input data and the same Case ID, if one is specified.
If you do not want to build models using one of the default algorithms, then deselect that algorithm. You can still add models using the deselected algorithm to a Clustering node.
By default, a Feature Extraction node builds a Nonnegative Matrix Factorization model.
If you are connected to Oracle Database 12c Release 1 (12.1) or later, then the node also builds a Principal Component Analysis model. You can specify that the node builds a Singular Value Decomposition model.
If you do not want to build models using one of the default algorithms, then deselect that algorithm. You can still add models using the deselected algorithm to a Feature Extraction node.
By default, a Regression node builds two models, one each using General Linear Model and Support Vector Machine.
All models in the node have the same input data, target, and case ID, if a case ID is specified.
If you do not want to build models using one of the default algorithms, then deselect that algorithm. You can still add models using the deselected algorithm to a Regression node.
By default, two test results, Performance Metrics and Residuals, are calculated.
By default, split data is used for the test data. The split is 40 percent and the split data is created as a view.
By default, a Model Details node uses automatic settings.
You can deselect automatic settings for a specific model details node, or you can deselect automatic settings for all model details nodes.
By default, a Test node uses Automatic Settings.
You can:
Deselect Automatic Settings for a particular Test node.
Deselect Automatic Settings for all Test nodes.
The settings in Text describe how text is handled during a model build.
The default Categorical Cutoff Value is 200.
The Default Transformation Type is Token.
Token Transformation Settings use these defaults:
Language: English (default)
Stemming (Deselected)
Stoplist: Default Stoplist
Maximum Number of tokens across all documents: 3000
Theme Transformation Settings use these defaults:
Language: English (default)
Stoplist: Default Stoplist
Maximum Number of themes across all Documents: 3000
You can edit the Parallel Processing and In-Memory settings for one or all nodes in the Preferences dialog box.
The Parallel Processing settings and In-Memory settings for all nodes are displayed in the Preferences dialog box, which opens when you click Data Miner under Preferences in the Tools menu.
Click Parallel Settings and select any one of the following options:
Enable: Select one or more nodes and click Enable to enable parallel processing for the selected nodes.
Disable: Select one or more nodes and click Disable to remove the parallel processing setting from the selected nodes.
All: Sets parallel processing for all nodes.
None: Removes parallel processing settings from all nodes.
Click In-Memory Settings and select any one of the following options:
Enable: Select one or more nodes and click Enable to enable In-Memory settings for the selected nodes.
Disable: Select one or more nodes and click Disable to remove the In-Memory settings from the selected nodes.
All: Sets In-Memory settings for all nodes.
None: Removes In-Memory settings from all nodes.
Click to specify parallel processing and In-Memory settings for a selected node. This opens the Edit Node Performance Settings dialog box.
You can specify preferences that apply to all transformations, as well as preferences for individual transformations related to Filter Columns, Filter Columns Details, Join, and Sampling.
The preference that applies to all transformations is:
Generate Cache Sample Table to Optimize Viewing: The default setting is to not generate a cache sample table. Generating a cache sample table is useful if you are processing large amounts of data.
For individual transformations, the options are:
The preferences specify the behavior of the Filter Columns Node transformation.
You can specify the following Data Quality criteria:
% Nulls less than or equal: Indicates the largest acceptable percentage of NULL values in a column of the data source. The default value is 95%.
% Unique less than or equal: Indicates the largest acceptable percentage of values that are unique to a column of the data source. The default value is 95%.
% Constant less than or equal: Indicates the largest acceptable percentage of constant values in a column of the data source.
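As a rough illustration of how the three Data Quality criteria act on a column, consider the following Python sketch. The exact formulas that Oracle Data Miner uses are not stated here, so the definitions below are assumptions: percent unique is taken as distinct non-null values over all rows, and percent constant as the share of the single most frequent value. The names `column_quality` and `passes_filter` are hypothetical. Because no default is given for % Constant, the sketch requires the caller to supply that threshold.

```python
from collections import Counter
from typing import Dict, List, Optional


def column_quality(values: List[Optional[object]]) -> Dict[str, float]:
    """Compute the three Data Quality percentages for one column."""
    total = len(values)
    if total == 0:
        raise ValueError("column has no rows")
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        # Assumed definitions; the product may compute these differently.
        "pct_nulls": 100.0 * (total - len(non_null)) / total,
        "pct_unique": 100.0 * len(counts) / total,
        "pct_constant": 100.0 * max(counts.values(), default=0) / total,
    }


def passes_filter(
    values: List[Optional[object]],
    max_pct_constant: float,      # no default is documented, so it is required
    max_pct_nulls: float = 95.0,  # documented default
    max_pct_unique: float = 95.0, # documented default
) -> bool:
    """Return True if the column stays within every threshold."""
    q = column_quality(values)
    return (
        q["pct_nulls"] <= max_pct_nulls
        and q["pct_unique"] <= max_pct_unique
        and q["pct_constant"] <= max_pct_constant
    )


print(passes_filter(["a", "a", "a", None], max_pct_constant=95.0))  # True
print(passes_filter([1, 2, 3, 4], max_pct_constant=95.0))           # False
```

The second column behaves like an identifier: every value is unique, so it exceeds the 95% unique threshold and is filtered out.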
You can specify the following Attribute Importance settings:
Importance Cutoff: A number between 0 and 1.0. The default cutoff is 0.
Top N: The maximum number of attributes. The default is 100.
Note:
You must select Attribute Importance to generate Attribute Dependency.
Sampling (Data Quality and Attribute Importance): Enables Filter Column settings according to the default size for random sample for calculating statistics. The default values for sampling are specified in preferences. You can change the default or even turn off sampling. The default sample size is 10,000 records.
The Column Filter node, by default, uses sampling to determine Data Quality and Attribute Importance. The default is to use a sample size of 2000 records. You can turn off sampling, that is, use all of the data, or increase the sample size.
The preferences specify the behavior of Filter Columns Details node transformation.
By default, Automatic Settings is selected.
The preferences specify the behavior of Join node transformation.
The settings are:
Automatic Key Column Deletion:
Automatic (default)
Manual
Default
Automatic Data Column Default:
Automatic (default)
Manual
Default
The preferences specify the behavior of sampling.
There are three preferences for sampling:
Sampling Type: By default, the Sampling Type is Random, which uses the Seed. You can change the Sampling Type to TopN.
Seed: By default, the Seed is set to 12345. You can change the value of Seed.
Sampling Size: This is either the number of rows or the percentage for the sampling size. The default size is either 2000 rows or 60%.
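The effect of the Seed and Sampling Size preferences can be sketched in Python. This is an illustration only, since Oracle Data Miner samples in the database: the function name `random_sample` and its parameters are hypothetical, and only the default values (seed 12345, 2000 rows, 60 percent) come from the preferences described above.

```python
import random
from typing import List, Optional, Sequence


def random_sample(
    rows: Sequence,
    seed: int = 12345,
    num_rows: int = 2000,
    percent: Optional[float] = None,
) -> List:
    """Draw a repeatable random sample by row count or by percentage."""
    if percent is not None:
        size = round(len(rows) * percent / 100.0)
    else:
        size = num_rows
    size = min(size, len(rows))  # never request more rows than exist
    rng = random.Random(seed)    # the same seed yields the same sample
    return rng.sample(list(rows), size)


data = range(5000)
print(len(random_sample(data)))                # 2000
print(len(random_sample(data, percent=60.0)))  # 3000
```

Keeping the seed fixed makes the sample repeatable across runs, while changing the seed draws a different set of rows of the same size.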
In Viewers, you can specify settings for data viewers and model viewers.
Explore Data Viewer
Graphical Settings
Models
Association Rules
Cluster Tree
Decision Tree
In the Data section, you can specify settings for Explore Data viewer and graphical settings.
The options are:
You can specify precision settings for the Explore Data viewer here. These preferences specify data precision for data viewed in the Explore Data node.
Precision is the maximum number of significant decimal digits, where the most significant digit is the left-most nonzero digit, and the least significant digit is the right-most known digit. Precision is an integer greater than or equal to zero.
The default precision for both Percentage Based Values and Numeric Values is 4.
You can change either or both of these values.
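The definition of precision as a count of significant digits can be illustrated with a short Python sketch. The function name `round_significant` is hypothetical, and the sketch assumes a precision of at least 1; it is not the formatting code that the viewer itself uses.

```python
import math


def round_significant(value: float, precision: int = 4) -> float:
    """Round value to the given number of significant digits."""
    if precision < 1:
        raise ValueError("this sketch assumes at least one significant digit")
    if value == 0:
        return 0.0
    # Position of the most significant (left-most nonzero) digit.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, precision - 1 - exponent)


print(round_significant(123.456))      # 123.5
print(round_significant(98765.4321))   # 98770.0
print(round_significant(0.000123456))  # 0.0001235
```

With the default precision of 4, each value keeps four digits counted from its left-most nonzero digit, regardless of where the decimal point falls.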
In the Models section, you can specify general preferences that apply to all model viewers.
Precision Level Settings: Specify precision:
For Percentage Based Values, the default precision is 4.
For Numerical Values, the precision is 8.
Fetch Size Settings: Specify the number of items fetched. The default fetch size for:
Association Rule Model: 1000
Clustering Rules Model: 20
All Other Models: 1000
There are additional preferences for the tree displays:
You can specify the preferences for the Cluster Tree model display.
Cluster Tree display contains:
Default Node Display: A detailed header
Default Layout: Vertical
There are also settings for Cluster Tree nodes.
You can specify the preferences for the Decision Tree display.
Decision Tree display contains:
Default Node Display: Histogram and Detailed Header.
Sort Target Values By: Root Node order. You can also sort by Confidence.
Default Layout: Vertical. You can also choose Horizontal.
There are also settings for Decision Tree nodes.
In the Workflow Editor, you can specify preferences related to nodes, link style, and alternate link routing.
The preferences and their default settings are:
Node Assist: Selected by default. Wizards are automatically displayed when a node is created or connected. For example, if you add a Data Source node to a workflow, then the Data Source Editor is automatically opened. You can deselect this option.
Link Style:
Direct: (Default) In the Direct link style, links are straight lines from one node to another with short segments. Direct link style produces a more compact, direct diagram layout.
Orthogonal: In the Orthogonal link style, links between the nodes are at 90 degrees.
Alternate Link Routing: Deselected by default.
Workflow Directory is the default directory from which workflows are imported and to which workflows are exported.
The default value for Workflow Directory is the default for the operating system where Oracle Data Miner is installed. For example, Workflow Directory is My Documents for Microsoft Windows operating systems.
To change the Workflow Directory, enter the name of the new directory or click Browse to browse for the directory. After you specify the directory name, click OK.
Preferences for the Workflow Jobs specify which connection is displayed and how long the workflow status is displayed.
The settings are:
Automatically Display Connection Selected in Navigator: By default, this option is selected. You can deselect it. If you deselect this option, then you must explicitly select a connection.
Don't Display Jobs Older Than 24 Hours: By default, this option is selected. You can change this option.
In the Workflow Scheduler dialog box, you can set email notifications and preferences for workflow jobs and workflow schedules.
Note:
To receive email notifications, you must have an email server set up properly.
In the Notification tab:
Select Enable Email Notification to receive notifications.
In the Recipients field, enter the email addresses to receive notifications.
In the Subject field, enter an appropriate subject.
In the Comments field, enter comments, if any.
Select one or more events for which you want to receive the notifications:
Started: To receive notifications for all jobs that started.
Succeeded: To receive notifications for all jobs that succeeded.
Failed: To receive notifications for all jobs that failed.
Stopped: To receive notifications for all jobs that stopped.
Click OK.
In the Settings tab:
In the Time Zone field, select a time zone of your preference.
In the Job Priority field, set the priority of the workflow job by placing the pointer between High and Low.
Select Max Failure and set the maximum number of failed workflow executions.
Select Max Run Duration and set the days, hours and minutes for the duration of maximum run time of the workflow job.
Select Schedule Limit and set the days, hours and minutes.
Click OK.
The Diagram menu is available when a workflow is open.
Use the options on this menu to arrange workflow nodes. The Diagram menu has these options:
Use the Connect option to connect two nodes in a workflow.
To connect a node: Select Diagram and click Connect. Link the selected node to another node by drawing a line from the selected node. You can make only valid connections. This is the same operation as using the Connect option from the node context menu.
To cancel a connection: To cancel a line that is not connected to anything, press Esc.
Use the Align option to align a set of elements and normalize the size of elements.
The settings are:
Horizontal Alignment
None: (Default) Performs no horizontal alignment.
Top: Aligns the top edges of the selected elements.
Middle: Aligns the horizontal centers of the selected elements.
Bottom: Aligns the bottom edges of the selected elements.
Vertical Alignment
None: (Default) Performs no vertical alignment.
Left: Aligns the left edges of the selected elements.
Middle: Aligns the vertical centers of the selected elements.
Right: Aligns the right edges of the selected elements.
Size Adjustments
Same Width: Changes the width of the selected elements to the average width of all the selected elements.
Same Height: Changes the height of the selected elements to the average height of all the selected elements.
Use the Distribute option to evenly distribute (horizontally and vertically) selected elements in a diagram.
The settings are:
Horizontal Distribution:
Changes the left or right distribution of the selected diagram elements.
Vertical Distribution:
Changes the up or down distribution of the selected diagram objects.
Use the Zoom option for a magnified view.
The default zoom setting is 100%. To return to the default, select 100%.
To zoom in or zoom out of an entire workflow: Click the zoom in or zoom out icon, respectively, or set a specific percentage.
To zoom in on a specific node or nodes: Select the node and then go to Diagram and click Zoom.
To fit the entire workflow in the window: Select Fit to Window.
The online help specific to Oracle Data Miner is in the help folder Oracle Data Miner Concepts and Usage.
To view or search the online help for Oracle Data Miner, click Help and then click Table of Contents. Then expand the Table of Contents and go to Oracle Data Miner Concepts and Usage on the Contents tab of Help Center.
To get help for a specific dialog box, click the Help button or press F1. To get help for objects in a workflow, select the object and press the F1 key.
Online help contains reference topics and the topics that describe how the GUI works. To see reference topics, either expand the help contents in the online help or search in the online help.
The Workflow Jobs viewer displays the workflows and other details such as the project, workflow status, and connections.
To view Workflow Jobs:
You can perform the following tasks with Workflow Jobs:
View a particular task: Select the connection in which the task is running. Connections are listed in a drop-down list just above the Workflow Jobs grid.
View an Event Log
Terminate a running workflow: Click the terminate icon.
View a log: Right-click a process in the Workflow Jobs and click View Log.
The Workflow Jobs grid displays the workflow name, project name, and workflow status.
You can also edit the Workflow Jobs preferences here, by clicking the gear icon. The Workflow Jobs grid displays the following:
Workflow name
Project name
Status. For Status, the values are:
ACTIVE: Indicates that the workflow has been executed. The status is indicated by an icon.
INACTIVE: Indicates that the workflow is idle.
STOPPED: Indicates that the workflow has been stopped.
SCHEDULED: Indicates that the workflow is scheduled to run.
FAILED: Indicates that the workflow execution has failed. The status is indicated by an icon.
In the Event Log, you can view the log of data mining events.
To view a log of data mining events in the selected connection, right-click an entry in Workflow Jobs and select Show Event Log. You can also click the event log icon at the top of a workflow, just under the tab.
By default, all errors are displayed. You can also display warnings or informational events. The total number of events and the number of events displayed are shown at the top of the list. For example, Events: 2 of 90 means that 2 of the 90 events are displayed.
Each error or warning has a message and details associated with it. Select an event. The message and details are displayed in the lower pane of the Event Log window.
For each data mining event in the selected connection, the following are displayed:
Event: In Oracle Data Miner, events indicate the beginning and end of actions, such as START(WORKFLOW) and END(WORKFLOW).
Each node is processed sequentially.
Job: The name of the job that processes the event. These jobs are internal to Oracle Data Miner.
Node: The name of the node that is being processed. Not all events are associated with a node.
Sub-node: An internal step during node processing. For example, an Anomaly Build node has a sub-node that builds the model.
Time: Start time of the event.
Message: A message about the event. If the event did not encounter problems, then there is no message.
To see more information about the message, including message details, select the event. The message and the details are displayed in the pane below the list of events. Not all events have messages or message details associated with them.
Duration: The amount of time that is spent on processing the event. The duration is displayed for END events. The duration is displayed in days, hours, minutes, and seconds.
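The breakdown of a duration into days, hours, minutes, and seconds can be sketched as follows. The function name `format_duration` and the output layout are assumptions for illustration; the Event Log may lay out the value differently.

```python
from datetime import timedelta


def format_duration(duration: timedelta) -> str:
    """Break a duration into days, hours, minutes, and seconds."""
    total_seconds = int(duration.total_seconds())
    days, remainder = divmod(total_seconds, 86400)  # 86400 seconds per day
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Hypothetical layout; only the four units come from the documentation.
    return f"{days}d {hours}h {minutes}m {seconds}s"


print(format_duration(timedelta(seconds=3725)))  # 0d 1h 2m 5s
print(format_duration(timedelta(days=1, hours=2, minutes=3, seconds=4)))  # 1d 2h 3m 4s
```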
All errors are shown. You can select the type of event to display by clicking the icons above the list of events:
To display errors, and events that failed, click the errors icon.
The default is to display errors only.
To display warnings, click the warnings icon.
To display informational messages, such as the start and end of operations, click the informational messages icon.
To refresh, click the refresh icon.
To search for events, click the down arrow next to the search icon. You can search by Node (default) or by Any to search for anything that is not a node.
To view the context menu, right-click a line in the Workflow Jobs grid or in a blank area of the grid.
The context menu options are:
Go to Workflow: Displays the workflow for this result.
Sort by Most Recent: Sorts the entries.
Preferences: Displays the Workflow Jobs preferences.
Projects reside in a connection. Projects contain all the workflows created in the data mining process.
You must create at least one project in a connection.
This section describes common controls and tasks that you can perform using the Oracle Data Miner GUI.
Use the Filter option to refine your search.
To filter your search:
To display only those items that you are interested in, click the binoculars icon. In other words, you can use this option to filter out items that you are not interested in. There are several filter options and a default option. To select a different filter option, click the triangle beside the binoculars icon.
To clear the search, click the clear icon.
The SQL Developer Import Data wizard allows you to import data into a database table.
You can create a table and import data into it, or you can import data into an existing table. In other words, you can import delimited data in an operating system file to the database.
To import data:
The Filter option filters out Oracle Data Miner (DR) objects and Oracle Data Mining (DM) objects.
To filter the tables associated with data mining:
You can export charts, graphs, grids, and also Cluster and Decision Tree Rules for use in external documents by copying them to the Microsoft Windows clipboard or saving them to a file.
You can perform the following tasks:
Copy charts to the clipboard or save to a file: To copy charts, right-click the chart and select any one of the following options in the context menu:
Copy to Clipboard
Save Image As
View Data
Copy data grids to the clipboard: You can copy one or multiple rows in the Graph Data dialog box. To copy data grids:
Select the row that you want to copy in the Graph Data dialog box while pressing the Ctrl key. To select multiple rows, click the rows while pressing the Ctrl and Shift keys together.
Then copy the selected rows by pressing Ctrl+C.
To paste the copied rows from the clipboard, press Ctrl+V.
Copy and view data content of charts: To copy data content of charts, right-click and select any one of the following options in the context menu:
Copy Image to Clipboard
Save Image As
Copy Cluster and Decision Tree Rules to the clipboard or to a file: To copy Cluster and Decision Tree Rules to a clipboard or a file, use the Save Rules option in the Model viewer.