Oracle® Fusion Middleware
Oracle WebCenter Forms Recognition Verifier User's Guide
12c Release 1 (12.2.1.4.0)
E93589-02
September 2019

Documentation for WebCenter Forms Recognition Verifier that describes how to configure and use the application to verify documents and to perform the Supervised Learning workflow.
Oracle Fusion Middleware Oracle WebCenter Forms Recognition Verifier User's Guide, 12c Release 1 (12.2.1.4.0)
E93589-02
Copyright © 2009, 2019, Oracle and/or its affiliates. All rights reserved.
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.
Contents
1 What Is Verifier?
1.1 What Happens When Verifier Process Documents?
1.1.1 Verifier Key Features:
1.1.2 Verifier Functionality Highlights
1.2 Some Helpful Terms
2 About WebCenter Forms Recognition Workflow
2.1 Start and Exit Verifier
2.2 About Logging into Verifier
2.3 About Specifying Login Information with Command Line Arguments
3 Roles
3.1 Change Your Password
4 About Configuring Verifier
4.1 Configure Verifier
4.2 About the General Settings
4.2.1 About Specifying the Project File
4.2.2 About Specifying the Directories
4.2.3 About Specifying Client Settings
4.2.4 About Specifying the Batch Options
4.2.5 About Fields Edit Mode
4.2.6 About Specifying Tabbing Behavior
4.2.7 Enable 508 Compliance
4.3 About the workflow settings
4.4 About Exception Handling Settings
4.4.1 Select States
4.4.2 Configuring Exception Handling
4.5 Quality Assurance with WebCenter Forms Recognition
4.6 Configure Supervised Learning
4.6.1 Supervised Learning Settings
4.7 Advanced Settings
4.8 About Batch Filtering
4.8.1 Configure Batch Filter Conditions
5 Configure Tasks to Perform at the Workstation
5.1.1 Configure the input and output states
5.1.1.1 About Verification Rules
6 Getting Familiar with the User Interface
6.1 Show Batch List
6.1.1 Batch List Keyboard Shortcuts
6.1.2 Batch List Toolbar Buttons
6.1.3 Navigation Toolbar Buttons
6.1.4 Batch List Icons
6.1.4.1 Batch List Columns
6.1.4.2 Sort in Batch List
6.1.4.3 Select a Batch in Batch List
6.1.4.4 Leave the Batch List
6.2 About the Document List
6.2.1 Open Document List
6.2.2 Document List Keyboard Shortcuts
6.2.3 Main Toolbar Buttons
6.2.4 About the Batch Structure Area
6.2.5 Navigate in the Document List
6.2.5.1 About Splitting and Appending Documents
6.2.5.2 Split a Multipage Document
6.2.5.3 Append Two Documents
6.2.6 Viewer Toolbar Buttons
6.2.7 About the Document Area
6.2.8 Print a Document
6.2.8.1 Printing Verified Data Content
6.2.9 About the Verification View Classification Window
6.2.9.1 Open the Verification View Classification Window
6.2.9.2 Classification Window Keyboard Shortcuts
6.2.9.3 Verification View Classification Window Toolbar Buttons
6.2.9.4 About the Class Selection List
6.2.9.5 Set or Change a Classification Result
6.2.9.6 Select a Class Result Using Advanced Classification
6.2.10 About the Verification View Indexing Window
6.2.10.1 Open the Verification View Indexing Window
6.2.10.2 Increase or Decrease the Image Area
6.2.10.3 Indexing Window Keyboard Shortcuts
6.2.10.4 Verification View Indexing Window Toolbar Buttons
6.2.10.5 Support for the Mouse Wheel
6.2.10.6 Form Area
6.2.10.7 Field Area
6.2.10.8 Navigate the Field Area
6.2.10.9 About the Verification View Document Area
6.2.10.10 Navigate the Document Area
6.2.10.11 About Tables in the Document Area
6.2.10.12 About the Current Input Area
6.2.10.13 About the User Information Area
7 About Working with Verifier
7.1 Manual Correction of Automatic Page Separation
7.2 About Manual Correction of Classification Results
7.2.1 Manually Correct Classification Results
7.3 About Processing Documents with an Obsolete Class
7.4 About Manual Correction of Extraction Results
7.4.1 Correct Invalid Results
7.4.1.1 Form Elements and Field Types
7.4.1.2 About Editing Text Fields
7.4.1.3 About Auto-Completion
7.4.1.4 About Inserting Words in Fields
7.4.1.5 Use a Word that is a Candidate for a Field
7.4.1.6 Use a Word That Is Not a Candidate for a Field
7.4.1.7 Insert Blocks of Text
7.4.2 Finish the Validation
7.5 About Manual Correction of Classification and Extraction Results
7.6 About Smart Indexing
7.6.1 Use Smart Indexing
7.7 Check Entire Batches
8 About Working with Tables
8.1 About Manual Training and Correction Methods
8.1.1 About Auto-Completion with Tables
8.1.2 Insert Candidate Words in Table Cells
8.1.3 Insert Non-Candidate Words in Table Cells
8.1.4 Correcting Table Structure
8.1.5 About the Rubber-Banding Feature
8.1.5.1 Auto-Scroll with the Rubber-Banding Feature
8.1.5.2 Add Column Data to a Table
8.1.5.3 Insert Column Data
8.1.5.4 Replace Column Data
8.1.5.5 Use the Rubber-Banding Feature
8.1.5.6 Rubber-Banding Use Case One
8.1.5.7 Rubber-Banding Use Case Two
8.1.5.8 Rubber-Banding Use Case Three
8.1.5.9 Rubber-Banding Use Case Four
8.1.6 About Correcting Single Cells
8.1.6.1 Delete an Unnecessary Cell
8.1.6.2 Insert a Cell
8.2 About Table Extraction and Correction
8.2.1 About Learning Lines
8.2.2 About Learning Column Mappings
8.2.3 About Correcting Fields in Tables Created with Brainware Table Extraction
8.2.4 Use the Standard Method for Table Extraction
8.2.4.1 Step 1: Show the First Row Sample
8.2.4.2 Step 2: Learn Mapping in the Row You Learned
8.2.4.3 Step 3: Learn Missing Lines
8.2.4.4 Step 4: Learn and Adjust the Mapping of Missing or Wrong Columns
8.2.4.5 Step 5: Manually Correct the Table Data and Validate the Table
8.2.4.6 Step 6: Learn the Document
8.2.5 Advanced Learning with Brainware Table Extraction
8.2.6 Advanced Learning: Additional Functions
8.2.6.1 About the Unmap Column Method
8.2.6.2 Undo Column Mapping
8.2.6.3 Learn a Block of Secondary Lines
8.2.6.4 Unlearn a Line
8.2.6.5 Learn a Line as a Wrong Line
9 Working with Learn Set Manager
9.1 About Supervised Learning
9.2 Start Learn Set Manager
9.3 Getting Familiar with the Learn Set Manager User Interface
9.3.1 Accumulated Documents Browsing
9.3.2 Global Learnset Browsing
9.3.3 Learn Set Manager Keyboard Shortcuts
9.3.4 Learn Set Manager Toolbar Buttons
9.3.5 Learn Set Manager Viewer Toolbar Buttons
9.4 Using Learn Set Manager
9.4.1 About the Learn Set Manager Process
9.4.2 Configure LearnSet Manager
9.4.3 Work with Common Learnsets
9.4.3.1 Correct Tables
9.4.3.2 Reclassify a Document
9.4.3.3 Accept and Reject Documents
9.4.4 About Sorting by Vendor and Other Sorting Extensions in Learn Set Manager
9.4.5 Work with Global Learnsets
9.4.6 Train Base Classes
9.4.7 Update Local Projects
9.5 About Using Learn Set Manager on Several Workstations
9.5.1 About Batch-Level Locking
9.5.2 About Tracking Changes Made by Supervised Learning Managers
9.5.3 About Project File and Learnset Locking During Training
10 Frequently Asked Questions
WebCenter Forms Recognition is a product suite designed for automatically processing incoming documents. It can process documents from arbitrary sources, including paper-based documents and electronic files.
Structured or unstructured document input is obtained either by scanning paper-based documents or as electronic files. All documents are stored on a computer's hard drive. WebCenter Forms Recognition monitors specified directories on this hard drive for new documents and imports any new documents that it detects.
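The directory monitoring described above can be pictured as a simple polling check. The following Python sketch is illustrative only and does not show how WebCenter Forms Recognition is implemented: the function name find_new_documents, the seen set, and the list of file extensions are assumptions made for this example.

```python
from pathlib import Path

# Hypothetical set of file types treated as documents; not taken from the product.
DOCUMENT_SUFFIXES = {".tif", ".tiff", ".pdf"}


def find_new_documents(import_dir, seen):
    """Return document files in import_dir that have not been reported before.

    `seen` is a set of file names that were already imported.
    It is updated in place so that each file is reported only once.
    """
    new_files = []
    for path in sorted(Path(import_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        if path.suffix.lower() not in DOCUMENT_SUFFIXES:
            continue  # skip files that are not documents
        if path.name in seen:
            continue  # already imported on an earlier pass
        seen.add(path.name)
        new_files.append(path)
    return new_files
```

Calling find_new_documents repeatedly with the same seen set, for example once per polling interval, returns only the files that arrived since the previous call.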
Imported documents are first analyzed to determine the document layout and to recognize structures such as words, lines, logos, or tables.
The documents are then classified according to predefined categories. Examples of typical categories used in classification are invoices, orders, offers, or resumes. Categories can be defined individually, depending on the needs of your organization. Using a set of sample documents, WebCenter Forms Recognition learns to distinguish which category a previously unknown document belongs to.
For each category, the data relevant for further processing is different. For example, if you are processing invoices, you probably want to know the total sum to be paid. This information is irrelevant if you are processing resumes, where the applicant's name, the desired position, and the contact options are more important. WebCenter Forms Recognition identifies and extracts data that is relevant for the respective document category. The data that is to be extracted can be defined individually to suit the needs of your organization.
Finally, the documents, their category assignments, and the extracted information are released from WebCenter Forms Recognition and written to designated export directories. The documents are then forwarded to connected systems. For example, invoices can automatically be forwarded to the software system used in your company's accounting department, while resumes are sent to Human Resources.
All this is done without human intervention once the WebCenter Forms Recognition application has been set up. But what happens if WebCenter Forms Recognition cannot properly process a document? There are several reasons this could happen, for example:
• Paper-based documents might be unclear, so that WebCenter Forms Recognition is not able to read them.
• There might be stamps or notes on the documents that make important sections illegible for WebCenter Forms Recognition.
• WebCenter Forms Recognition may encounter a document from an unknown category. Since the software was not previously trained to recognize documents from this category, it will not be able to process the document.
• WebCenter Forms Recognition may have been configured to extract information that is missing from the document, for example because a form was not filled in correctly.
That is where Verifier comes in. Verifier is the quality assurance utility of the WebCenter Forms Recognition suite. The application detects all documents with processing problems and presents them to the operator for verification.
Since the verification step is done before the export step, only qualified output will leave the WebCenter Forms Recognition process. Therefore, subsequent systems will only receive appropriate input.
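The gating role of the verification step can be sketched as a routing rule: a document is released for export only if it was classified and every expected value was extracted; otherwise it is held for an operator. This is a minimal sketch under stated assumptions, not the product's internal logic. The Document class, its attributes, and the functions needs_verification and route are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Document:
    """Hypothetical processing result for one document."""
    name: str
    category: Optional[str] = None  # None means classification failed
    # Extracted values by field name; None means the value could not be extracted.
    extracted: Dict[str, Optional[str]] = field(default_factory=dict)


def needs_verification(doc):
    """Return True if the document has a classification or extraction problem."""
    if doc.category is None:
        return True
    return any(value is None for value in doc.extracted.values())


def route(documents):
    """Split documents into (to_verify, to_export), preserving their order."""
    to_verify, to_export = [], []
    for doc in documents:
        if needs_verification(doc):
            to_verify.append(doc)
        else:
            to_export.append(doc)
    return to_verify, to_export
```

In this sketch, only the documents in to_export would continue to the export step; the documents in to_verify would wait until an operator has corrected them.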
The WebCenter Forms Recognition database platform enables you to keep a central store of your project and authentication information. This solution also allows for central management of storage and backup and thus provides for easier security, better connectivity of your applications, and higher flexibility for your personnel.
• Allows central WebCenter Forms Recognition database storage of projects and of user authentication information for more flexible access.
• Allows convenient correction of automatic classification results.
• Allows convenient correction of automatic extraction results.
• Allows manual indexing of documents.
• Allows semi-automatic indexing of documents by means of database lookups.
• Allows a final check of corrected documents before release.
• Sophisticated status management and filter techniques show you only the documents you have to check and nothing else.
• During the application design, the user interface can be configured, providing optimum display options for each document category.
• Keyboard shortcuts are available for most operations.
• Through automatic locking, document batches can be processed by teams of operators.
This section defines some terms that are helpful when using Verifier.
Batch | A batch is just a stack of documents. Usually, this stack is not sorted. In the context of WebCenter Forms Recognition, batches consist of electronic documents. The documents inside such a batch may be paper-based documents that have been scanned to transform them into a digital format, or files created using applications such as a word processor. Various documents are normally assigned to the same batch only because they have been received within the same time period. For example, all letters received in the morning may be scanned until noon and therefore end up in the same batch. |
Folder | In a business environment, folders are normally used to keep several documents together. WebCenter Forms Recognition does the same thing with folders. However, in the context of WebCenter Forms Recognition, a folder is always a structure inside a batch. This means that batches can either consist of document stacks, or they can consist of stacks of folders. |
Document | A document is a piece of information that can serve as evidence of an event, situation, or business transaction. For example, a packing slip may provide evidence that an order has actually been shipped. Since people are used to working with paper, electronic documents strongly resemble paper-based documents. You will notice that WebCenter Forms Recognition documents consist of one or several pages, though the concept of a page is not really required for digital documents. |
Classification | Classification means taking an unsorted stack of documents and organizing them into smaller stacks so that each stack contains only documents belonging to the same category. In other words, you start with a mess and end up with an organized stack of invoices, a second stack of resumes, a third stack of orders, and so on. Class and category are the same thing. |
Indexing | Imagine you have a homogeneous stack of invoices, and you start to write out the information contained in the documents. For each invoice in the stack, you note the name of the supplier, the total sum to be paid, and the invoice number. This procedure is called indexing, and the information that was noted is the indexing information. Once you have finished, you file the invoices and use the indexing information to build your filing structure. Later, you will be able to search and identify the document with the help of the indexing information. In the context of WebCenter Forms Recognition, indexing information is applied to a set of fields associated with the document. For each document category, a different set of fields can be used. |
Extraction | If you take the stack of invoices and again write out the name of the supplier, the total sum to be paid, and the invoice number, but this time it is done automatically, the procedure is called extraction. Extraction is a means for automatic document indexing. Extraction is context-sensitive; that is, the extracted information depends on the document category. |
State | A state is a number that tells you how far the processing of a document has progressed. If the entire procedure of document processing consists of single steps, then the state increases with each step that has been completed. The state also indicates whether a step has been completed successfully, or whether there have been problems. In WebCenter Forms Recognition, states are determined hierarchically from the bottom up. If anything is wrong with a document, then there is also something wrong with the batch it belongs to. |
Verification | Verification is a task related to quality assurance. It involves taking a document that has been processed or partially processed, checking the processing results, and correcting any errors. |
Validation | Validation is another task related to quality assurance. Validation means confirming that a processing result is correct. This can be done at several levels: for the class or a field associated with a document, for the document as a whole, or for an entire batch. |
Learnset | In classification, a learnset is a group of documents whose classification is specified by a user. For each view and each class, the user must provide a sufficient number of representative documents. Similarly, in extraction, a learnset is a set of documents whose field contents are selected by the user from a set of candidates. |
In WebCenter Forms Recognition, the flow of incoming documents follows a sequence of standard processing steps. One of the objectives of WebCenter Forms Recognition is to get documents to their recipients as quickly as possible.
Automatic steps are executed by the Runtime Server and include document import with batch creation, OCR and layout analysis, classification, extraction, export, and clean-up. These automatic steps are complemented by two manual verification steps that ensure only high-quality output is produced:
• Verification of the classification step.
• Verification of the extraction step.
If the Runtime Server has completed an automatic step and the batch contains only valid results, the next automatic step can proceed without human intervention.
However, if the Runtime Server detects that the batch contains invalid results, the user can manually analyze and resolve the problem. Invalid batches are presented to you in a task list, called the Batch List. Finally, when WebCenter Forms Recognition has finished processing a batch, the documents are sent to their recipients.
If Verifier was installed as recommended, you can launch it from the Windows Start menu as follows:
Start > All Programs > Oracle WebCenter Forms Recognition > WebCenter Forms Recognition Verifier
After startup and login, the application displays the Batch List.
To quit Verifier, select Exit from the File menu.
When you log in to an existing project in Verifier, you must supply your user name and password. This password is not necessarily the same as the one you use to log in to your workstation. Instead, it is specific to Verifier, and possibly to the project. However, you probably have the same user name and password for all Verifier projects you work on. Your user name and password were assigned to you in Designer when your project administrator configured the project.
Your project administrator can give you the option to remember your user name and password between logons. This option is enabled if the Remember Password checkbox appears on the logon form. To remember your user name and password between logons, fill in your user name and password and select Remember Password before clicking OK. The next time you log on from the same computer, the system fills in the user name and password automatically, so that simply clicking OK logs you in.
When you launch Verifier for the first time, the application is not yet configured: the Batch List is empty and an error message is displayed. Verifier must be configured before it can be used, and this should be done by an experienced user.
To suppress project authentication when starting Verifier, you can specify logon information as command line arguments. The command line argument for the user name is /USR, and for the password it is /PWD.
For example, the following line in a Windows batch file placed in the WebCenter Forms Recognition program folder launches Verifier under John Smith's account:
start /B DstVer.exe /USR "John Smith" /PWD john1234567
You can use the same mechanism from the Windows Run menu, for example:
"C:\Program Files (x86)\Oracle\WebCenter Forms Recognition\Bin\bin\DstVer.exe" /USR "John Smith" /PWD john1234567
If the password is empty, there is no need to specify the /PWD option. For example:
start /B DstVer.exe /USR "Guest User"
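A launcher script could assemble the same arguments programmatically. The following is a minimal Python sketch and not part of the product: the function name build_verifier_command and the idea of starting DstVer.exe from Python are assumptions made for illustration. Only the /USR and /PWD arguments, and the rule that /PWD can be omitted for an empty password, come from the text above.

```python
import subprocess


def build_verifier_command(exe_path, user, password=""):
    """Return the argument list for launching Verifier with logon arguments.

    Hypothetical helper: only /USR and /PWD are taken from the guide.
    """
    args = [exe_path, "/USR", user]
    # The guide states that /PWD is not needed when the password is empty.
    if password:
        args += ["/PWD", password]
    return args


if __name__ == "__main__":
    # Example values from the guide; adjust the path to your installation.
    cmd = build_verifier_command("DstVer.exe", "John Smith", "john1234567")
    # list2cmdline shows how the list is quoted on a Windows command line.
    print(subprocess.list2cmdline(cmd))
```

Passing the arguments as a list leaves the quoting of names that contain spaces, such as "John Smith", to the launcher instead of to hand-written quotes.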
The administrator can also review who has logged in to the application by using a script. Refer to the Oracle WebCenter Forms Recognition Scripting User's Guide for more information.
Depending on the assigned role, Verifier users are able to complete the following tasks:
• Define, modify, and maintain the learn set.
• Collect and manage local training data.
• Propose learn set candidates to improve performance.
• Verify the documents that Runtime Server could not automatically process.
• Access and change the batch filtering properties.
• Access and change the settings.
Refer to "Users, groups, and roles" in the WebCenter Forms Recognition Designer User Guide for more information.
To change your password, complete the following steps:
1. Load the project in the Designer application.
2. From the Options menu, select Change Password... The Properties dialog box is displayed.
3. Enter your existing password in the Old Password field.
4. Enter a new password in the New Password field.
5. Enter the new password again in the Confirm New Password field.
6. Optional. Select the Decode Password option to see the password as you type it. If Decode Password is cleared, the password is masked with asterisks as you type.
7. Optional. Select Update Password in Database. This option is only available if the project administrator has enabled database authentication for the project.
8. Click OK.
You can only change the Verifier settings if you have been assigned the Verifier Settings role.
Configuring Verifier entails specifying which batches of documents are processed at a given workstation. This includes the following:
• Whether the batches are sourced from the file system or from the WebCenter Forms Recognition database.
• The location of the batches in the file system.
• The project file that contains the settings used to process the documents.
• The processing steps that you want to verify, that is, classification, extraction, or both.
• The status of batches before and after processing.
It also entails configuring 508 Compliance, but this is done at the workstation level, not the project level.
After you configure the Verifier settings, you can load and save them using commands on the File menu. When loading or saving a project, you can load or save a file with or without network data. When loading, click on the file type drop-down box and select either <project name> (*.sdp), or <project name> skip learn data (*.sdp).
Note: You can only work with Verifier after these settings are established. Only experienced users should change the settings.
To configure Verifier, perform one of the following actions:
• From the Options menu, select the Settings... option.
• Click the Settings button on the toolbar.
The General tab is the place for general settings. It allows you to configure your referenced directories and files. You can also choose the WebCenter Forms Recognition database as your document and statistics source here.
The Use Project File option is used to select the path and file name of the WebCenter Forms Recognition project that processes the documents, and which contains the design of the verification forms that you use to verify the extraction.
• When you select a new project and click OK, the project loads after returning to the Batch List.
• The title bar indicates the currently loaded project.
• When you log in to Verifier, the system displays a prompt telling you which project is loaded.
If the Use Batch Specific Project File option is active, the project loads each time you open a batch. If you use this option, ensure that the Allow Database Authentication option is selected for all users in Designer.
With the Use Database as Document and Statistics Source option, WebCenter Forms Recognition core information can be stored in the Forms Recognition database. If you have selected the database as your source, you can also select the job you want from the Select Job drop-down list.
Note: The file system functionality is still supported, although use of the database is recommended.
The Display ... Batches per Page option enables you to set the number of batches displayed per page. Valid values are from 1 to 200. This option is only available if the Use Database as Documents and Statistics Source option is selected. The default value is 50.
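To see what the page size setting means for a large job, the arithmetic can be sketched as follows. This is an illustration only: the function name page_count is hypothetical, and the way Verifier pages the Batch List internally is not described in this guide. The range of 1 to 200 and the default of 50 are the values stated above.

```python
import math

DEFAULT_BATCHES_PER_PAGE = 50  # default value stated in the guide


def page_count(total_batches, per_page=DEFAULT_BATCHES_PER_PAGE):
    """Return how many Batch List pages are needed (hypothetical helper)."""
    # The guide allows between 1 and 200 batches per page.
    if not 1 <= per_page <= 200:
        raise ValueError("batches per page must be between 1 and 200")
    return math.ceil(total_batches / per_page)
```

With the default setting, a job of 120 batches would span three pages under this model.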
The Batch Root directory is where the batch control files are located. This option is not available when using the WebCenter Forms Recognition database.
The Image Root directory is where subdirectories with the scanned images can be found. As a rule, the batch root and image root should be the same. In special cases, for security reasons for example, the image root can be different from the batch root.
The Client option refers to the intent to use client-specific variables. Currently, only the default setting is available. In Designer, project administrators can define global variables for different clients. With the default entry, global variables do not vary by client.
If the Automatic Batch Refresh option is checked, the Batch List automatically shows newly generated batches with matching states. If you do not want the automatic update, you can clear the checkbox. This leaves you the option to refresh the Batch List manually when you need up-to-date information.
The Create New Image File When Cutting Document option enables Verifier users to create new TIFF image files when a workdoc is split into multiple workdocs. The TIFF files correspond to the new workdocs.
Note: This feature is disabled when the Use Database option is selected. Any project that needs to use this feature must use the file system instead of the Forms Recognition database.
The Enable Cut Keeping Cover Page option enables Verifier users to cut a long document, such as a multipage fax, into several shorter documents while still retaining the cover page of the original workdoc as the cover page for each of the newly created, shorter documents. If this option is checked, the shortcut menu in Document Browsing view has additional menu entries. The new documents must then be OCR'd again.
When a document is opened that requires correction or confirmation of extraction results, the cursor is automatically placed in the first invalid field. If you select Insert mode, the cursor is placed to the left of the field contents. If you select Overwrite mode, the entire field content is selected.
If you select the Tab Through Invalid Fields Only option, when a user presses the [Tab] key, [Shift] + [Tab], [Ctrl] + [Tab], or [Ctrl] + [Shift] + [Tab] to tab through the fields in Document Verification mode, the system tabs through invalid fields only. When the user presses the [Tab] key inside a table control, the system tabs through invalid table cells only.
The Enable 508 Compliance option, on the General tab, activates 508 Compliance accessibility settings for your workstation. The option enables 508 Compliance for all projects you work with from this station. Users at other workstations who do not want to use these features do not have to use them, even if they work on the same projects you do. To enable the 508 Compliance options, complete the following steps.
When 508 Compliance is enabled:
• A blue arrow shows which field has focus.
• Additional visual indicators besides color highlighting help distinguish between invalid fields, valid fields, and questionable fields. These indicators are present in table fields and form fields. Green check marks show valid fields, red Xs show invalid fields, and orange question marks show questionable fields. Field candidates are highlighted in yellow, but do not have additional validity icons.
• All menu items have underscored letters that can be used as [Alt] menu shortcuts.
• Pop-up menus for workflow state lists and exception handling can be activated by the right-click key on the keyboard. This key is on the right of the standard keyboard, in between the Windows key and the [Ctrl] key.
• In Show Selected Batch, the right-click keyboard key activates the shortcut menu for Append This Document to Previous One and Cut Pages into a New Document.
• During document verification, pressing [Ctrl] + [M] or selecting Show Selection Context Menu activates the shortcut menu for the currently selected item.
• In the highlight columns for interactive learning mode, unmapped column items are indicated by a blue rectangle without icons, while valid and invalid column items are indicated by rectangles with a valid or invalid icon at the left side of every item.
If input focus is lost for any reason, the user can manually restore it from the main menu or by pressing [Ctrl] + [N].
The settings allow you to define the workflow of documents that are processed by Verifier. After each processing step, output states are assigned to batches, which distinguish success from failure.
To specify what to do if the verification cannot be finished normally, select the Exception Handling tab.
A document with an unexpected error may not be suitable for verification. Moving the document into an exception state will flag the batch for issues.
Having a mechanism to handle unexpected failures allows operators to remove the batch from their task list. Then operators can manually assign special states to documents.
For each selected state, a menu command is available in the Verification View. The menu commands allow for case-specific handling of various types of unforeseeable errors. The description represents the menu command's label.
To select a state and set its description label, complete the following steps:
1. On the Exception Handling tab, to enable a state, select the corresponding checkbox.
Note: The available exception states cover the range from 601 to 699. A batch state corresponds to the lowest document state within the batch. Routing batches using their exception state is only possible if the state for successful verification is greater than the one used for exceptions.
2. To set the description label, right-click on the existing label and select New Description from the popup menu.
3. Type the label into the corresponding field and confirm.
Note: The maximum length allowed for a description is 128 characters.
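The routing condition in the first note follows directly from the rule that a batch takes the lowest state of its documents. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and the example states 700, 650, and 550 are invented for illustration. Only the 601 to 699 range and the lowest-state rule come from the guide.

```python
EXCEPTION_STATES = range(601, 700)  # 601 to 699 inclusive, per the guide


def is_exception_state(state):
    """Return True if the state lies in the exception range."""
    return state in EXCEPTION_STATES


def batch_state(document_states):
    """A batch state corresponds to the lowest document state in the batch."""
    return min(document_states)


# Hypothetical case 1: successful verification writes 700, one exception at 650.
# The batch takes state 650, so it can be routed by its exception state.
routable = batch_state([700, 650, 700])

# Hypothetical case 2: successful verification writes 550, below the exception.
# The batch takes state 550 and the exception is hidden at batch level.
hidden = batch_state([550, 650, 550])
```

In the second case the batch state is not an exception state, which is why the state for successful verification has to be greater than the one used for exceptions.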
The following settings are available for exception handling:
Before Moving a Document to an Exception State, Save It Automatically | Saves a document automatically before moving it to an exception state. This applies only to the current document. |
Create New Batches with Documents Marked for Exception Handling | When this option is selected, the documents that are marked for exception handling will be moved to an exception batch. • A batch is created for each exception code. • The new batch receives a new batch ID. • Documents from all verified batches are moved to the same exception batch in the Batch List. • These batches can be released manually or automatically. When this option is disabled, documents marked for exception handling stay in their batches. These batches keep their batch ID but are renamed according to the state description. |
Automatically Release All Available Pending Exception Batches that contain N or more documents or older than M minutes | When this option is selected, an exception batch is released once it contains more than the defined number of documents, or is older than the defined number of minutes. This allows critical exception documents to be processed without waiting for manual intervention. Exception batches are also released when the user exits the application and logs in again. |
By Default, set exception mode to Batch | Selecting this option automatically changes the scope of the Move to Exception State command on the Options menu to Batch. |
Allow user selection of exception mode (Batch vs. Document) | This setting enables the exception mode to be changed dynamically on the Options menu in the document verification view. By default, this option is switched on. Use this option together with the option above to preserve a specific exception mode for the different user groups. |
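The automatic release rule can be read as a simple either-or condition. The following sketch is an interpretation, not the product's implementation: the function and parameter names are hypothetical, and it assumes the "N or more documents" wording of the option label for the document count.

```python
from datetime import datetime, timedelta


def should_release(doc_count, created_at, now, min_docs, max_age_minutes):
    """Decide whether a pending exception batch would be released.

    Assumption: release when the batch holds at least min_docs documents
    (the "N or more" wording of the option label) or is older than
    max_age_minutes.
    """
    too_old = now - created_at > timedelta(minutes=max_age_minutes)
    return doc_count >= min_docs or too_old
```

Either condition alone is enough, so a small batch of critical exception documents is still released once the age limit has passed.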
To properly ensure the quality of automatically processed documents, there are two things you need to understand:
• Batches are the basic entity. WebCenter Forms Recognition works on batches and will completely process one batch before processing the next. Verifying and approving entire batches before routing to subsequent systems is an important step.
• A batch is valid only if all documents and processing results associated with the batch are valid. Because we are dealing with information and data, we do not use the terms working or damaged. Instead, we use the terms valid or invalid.
  • A batch is invalid if one or more folders inside the batch are invalid.
  • A folder is invalid if one or more documents inside the folder are invalid.
  • A document is invalid if it has been classified automatically but the classification result is invalid, or if data has been extracted automatically from it but one or more fields are invalid.
  • A classification result is invalid if no matching class could be found, or the class has been changed manually and not yet validated.
  • A field is invalid if it could not be filled, its content does not comply with validation rules that have been defined, or its content has been changed manually and not yet validated.
Field validation rules may be violated for a number of reasons:
• The set of allowed characters may be restricted.
• Only uppercase characters may be allowed.
• There may be restrictions on the number of characters the field can contain.
• WebCenter Forms Recognition may enforce that characters which could not be identified with certainty during the OCR process must be checked. These questionable results are indicated in red and are underlined.
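The validity rules above roll up from fields to documents, folders, and batches. A minimal Python sketch of that roll-up follows. All class and attribute names are hypothetical and chosen for illustration. The model is also simplified: it assumes every batch is organized into folders, and it reduces each validation rule to a single flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexField:
    filled: bool = True                 # the field could be filled
    passes_rules: bool = True           # content complies with validation rules
    changed_unvalidated: bool = False   # changed manually, not yet validated

    def is_valid(self) -> bool:
        return self.filled and self.passes_rules and not self.changed_unvalidated


@dataclass
class Document:
    class_found: bool = True                  # a matching class was found
    class_changed_unvalidated: bool = False   # class changed, not yet validated
    fields: List[IndexField] = field(default_factory=list)

    def is_valid(self) -> bool:
        class_ok = self.class_found and not self.class_changed_unvalidated
        return class_ok and all(f.is_valid() for f in self.fields)


@dataclass
class Folder:
    documents: List[Document] = field(default_factory=list)

    def is_valid(self) -> bool:
        return all(d.is_valid() for d in self.documents)


@dataclass
class Batch:
    folders: List[Folder] = field(default_factory=list)

    def is_valid(self) -> bool:
        # One invalid folder is enough to make the whole batch invalid.
        return all(folder.is_valid() for folder in self.folders)
```

A single field that violates a validation rule therefore makes its document, its folder, and its batch invalid, which is what brings the batch to the operator's attention.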
The Supervised Learning tab is not available unless Supervised Learning was enabled for the project by the administrator. To configure supervised learning in Verifier with a WebCenter Forms Recognition database, complete the following steps:
1. To enable Supervised Learning, on the Supervised Learning tab, select Activate Supervised Learning workflow.
2. In the Local project name field, type or browse to the local project.
Note: Configure the local project with its own base directory (local learn set) and batch root.
3. In the Knowledge base directory field, type or browse to the common learn set directory.
Note: The common learn set updates whenever the local learn set is migrated to it.
4. Optional. To push the local learn set to the common learn set, select Distribute Local Learnset to Knowledge base and complete any of the following substeps:
Note: This option automatically places any documents added to the local learn set into a queue so that the Learnset Manager can review whether the documents are appropriate for the global learn set. The knowledge base is often referred to as a queue of accumulated documents, or as the common learn set pending review by the Learnset Manager.
• To prevent the learn set from being trained locally, select Nominate for the learnset but never train locally.
• Select Use Database as knowledge source and then select a job from the Select Job list.
Note: To use this option, you need to have a job for a common learnset in the database. If this common learnset job is not available, create a database job in Runtime Server with the common learnset directory as the batch root.
5. The Always show state of all field locations after opening a document option is reserved for future enhancements and is not yet available.
6. To create a learnset, select Apply local classification and extraction automatically.
Note: When no local project or learnset is used, the global project and global learnset are used instead.
Now the SLW user can perform classification and automatic extraction in Verifier. Also, the SLM user can now launch Learnset Manager to verify the learnsets from the Common Learnset in the Accumulated Documents Browsing mode.
7. To receive a notification in case of a discrepancy between the script and the commands you ran to populate the learnset, select Prompt if script forces or rejects insertion to Learnset.
8. Under Put document to Local Learnset, select one of the following options:
• To add a document to a learnset on a user's request, select Only if adding activated by a user.
• To add documents automatically to the learnset if the specified threshold is exceeded, select Automatically if more than N% invalid fields, then type the threshold value in the % field, and optionally select Always prompt before adding to display a confirmation dialog box before the document is added.
9. Under Learn new documents, select one of the following options:
• To initiate learning only when a user requests it, select Only by user request.
• To initiate learning for every batch in the project each time any batch is closed, select Before batch closing.
• To initiate learning anytime a document is added to the learnset, select Immediately.
Supervised Learning is the interactive verification and training of learnsets. Selecting the Activate Supervised Learning Workflow checkbox enables Supervised Learning.
The individual Base Settings for Supervised Learning are described below:
Local Project Name | The file and pathname for the local project. |
Knowledge Base Directory | The file and pathname of the common learnset. The common learnset will be updated whenever the local learnset is migrated to it. |
Distribute Local Learnset to the Knowledge Base | Automatically adds any documents added to the local learnset into a queue for the Learnset Manager to review if the documents are appropriate to be added to the global learnset. The knowledge base is often referred to as a queue of accumulated documents or common learnset, pending review by the Learnset Manager for improvements into the project file. |
Nominate for the Learnset but Never Train Locally | This option enables you to prevent the learnset from being trained locally. |
Use Database as Knowledge source | Here you can select the desired job from the list. The list shows all batch jobs in the database when the Use Database option is selected. |
Always Show State of All Field Locations After Opening a Document | Not available in this version. |
Apply Local Classification and Extraction Automatically | New classes will be created using the supplier's name. A learnset should also be created if you select this setting. When no local project or learnset is used, the global project and global learnset are used instead. |
Prompt if Script forces or Rejects Insertion to Learnset | You will be notified if there is a discrepancy between the script and your commands regarding the population of the learnset. |
Note: This information should be inherited from the settings your project administrator established in Designer.
The individual settings that comprise the Put Document to Local Learnset group are described below:
Only if Adding Activated by a User | If selected, a document will be added to the learnset only if the user requests it. This will be done automatically with no confirmation. |
Automatically if More Than N% Invalid Fields | If selected, documents will automatically be added to the learnset if the threshold you set in the associated text field is exceeded. |
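The threshold rule of the Automatically if More Than N% Invalid Fields setting can be pictured as a percentage check. The sketch below is an interpretation under stated assumptions: the function name is hypothetical, the percentage is assumed to be the share of invalid fields among all fields of one document, and the comparison is strict because the setting says "more than". The guide does not spell out how Verifier computes the value.

```python
def should_add_to_learnset(invalid_fields, total_fields, threshold_percent):
    """Return True if the share of invalid fields exceeds the threshold.

    Assumption: the percentage is computed per document as
    invalid_fields / total_fields.
    """
    if total_fields == 0:
        # A document without fields has nothing that could exceed the threshold.
        return False
    invalid_percent = 100.0 * invalid_fields / total_fields
    return invalid_percent > threshold_percent
```

Under this reading, a document with 3 invalid fields out of 10 exceeds a threshold of 25, while a document with exactly 20 percent invalid fields does not exceed a threshold of 20.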
The individual settings that comprise the Learn New Documents group are described below:
Only by User Request | Learning is initiated only when a user asks for it. |
Before Batch Closing | Learning is initiated for every batch in the project each time any batch is closed. |
Immediately | Learning is initiated anytime a document is added to the learnset. |
Select the Advanced tab for additional features. The following table provides a list of the Project File Updating settings:
Activate Project File Updating | Select this checkbox to activate the project file updating feature. |
Source Project File Location | The path and file name of the source project file. |
The Batch Filter function enables you to specify filter conditions that determine which batches are displayed. This is useful if you want to find a subset of batches in a large job, or to limit Verifier user activities.
The filtering dialog box is accessible outside the settings dialog box so that users without the SET role but with the FLT role are able to filter batches. Only users who have the FLT role assigned are able to configure filter conditions.
The saved filtering settings apply to the current batch list, and the application saves them for future sessions.
To configure batch filter conditions, complete the following steps:
1. From the Options menu, select the Filtering... option, or click the Batch Filter toolbar button. The Batch Filtering Conditions Properties dialog box is displayed.
2. Double-click on an entry in the left pane to select a batch attribute. The Filter Condition field is populated with the selected attribute.
3. Double-click on a filter condition in the right pane. The selected condition is added to the Filter Condition field.
4. Complete the filter condition by editing the Filter Condition field as required. The following list describes the data types that are valid for each attribute:
• Batch ID: String or number.
• State: Number.
• Priority: Number.
• Name: String.
• Folders: Number.
• Documents: Number.
• Client: String.
• Last User: String.
• Last Module: String.
• Last Access: Date.
5. Click OK to close the dialog box and apply the filter to the Batch List.
Note: To clear the batch filter, open the Batch Filtering Conditions Properties dialog box and click the Clear Condition button.
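The data types listed in step 4 can be captured in a small lookup table, for example to check a value before it is typed into a filter condition. This is a sketch only: the dictionary and function names are hypothetical, Python types stand in for the guide's "String", "Number", and "Date", and the syntax of the Filter Condition field itself is not shown in this guide and is therefore not modeled.

```python
from datetime import date

# Attribute names and data types as listed in the guide.
ATTRIBUTE_TYPES = {
    "Batch ID": (str, int),
    "State": (int,),
    "Priority": (int,),
    "Name": (str,),
    "Folders": (int,),
    "Documents": (int,),
    "Client": (str,),
    "Last User": (str,),
    "Last Module": (str,),
    "Last Access": (date,),
}


def value_matches_attribute(attribute, value):
    """Return True if the value has a data type valid for the attribute."""
    return isinstance(value, ATTRIBUTE_TYPES[attribute])
```

For example, the number 700 is a valid State value under this check, while the string "700" is not.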
To specify the tasks that are to be carried out at the current Verifier station, complete the following steps:
1. On the Verifier Properties dialog box, select the Workflow tab.
2. Complete one or more of the following options:
• To configure document separation, ensure the Document Separation button is pressed.
• To configure classification verification at this workstation, ensure the Classification Verification button is pressed.
• To configure extraction verification, ensure the Extraction Verification button is pressed.
Note: These steps can be performed at the workstation.
After you have selected the workflow steps to perform, establish values for the Input State and Output State for each enabled workflow step. To add an input value, complete the following steps:
1. Right-click on the Input State list box and select Add State from the pop-up menu.
Note: You can also change states and delete states this way.
2. To set an Output State value, select the value from the drop-down list to the right of the workflow step buttons.
The following verification rule options are available:
Verify Document for the Lowest Input Verification State Only | When this option is selected, the correction of the documents is grouped. After the verification of each input state, the user is asked to release the batch even if there are still documents with a higher input state left to be corrected. This option is valuable when you use several forms to verify extraction fields. If you have several forms defined for default processing (meaning that this option is not selected), all forms are shown for the document that is corrected.
By Default Open the First Available Invalid Batch and Not the Selected One | If this option is selected, the first available invalid batch is opened, rather than the batch that the user selected from the batch list. The first invalid batch is selected based on priority (higher first), the user's custom filter, sort settings, and the batch ID. This option is intended for projects with a large number of batches and simultaneous Verifier users, and decreases the time delay of project verification.
Perform Automatic Extraction After Classifying Documents Manually | When selected, this option forces WebCenter Forms Recognition to attempt to automatically extract data after the Verifier operator manually classifies the document. To select this option, the output state of the Classification Verification workflow step must be entered as an input state for the Extraction Verification workflow step.
Keep Showing Current Document After Saving | When this option is selected, Verifier displays the current document after it has been saved, instead of automatically displaying the next document.
Allow Immediate Copying of Selected Area to a Field or Table Cell | When this option is selected, Verifier allows copying of a selected area to a field or table cell when verifying. This may speed up the verification process by copying single words and candidates to verification elements.
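The selection order for the first available invalid batch can be pictured with a short Python sketch. It is a simplified model under stated assumptions: the `Batch` type, the `first_invalid_batch` function, and the state numbers are hypothetical, sort settings are left out, and "invalid" is taken to mean that the batch state is one of the station's input states. It is not the product's actual implementation.

```python
from typing import Callable, Iterable, NamedTuple, Optional, Set


class Batch(NamedTuple):
    batch_id: str
    state: int
    priority: int  # 1 is the highest priority, 9 the lowest


def first_invalid_batch(
    batches: Iterable[Batch],
    input_states: Set[int],
    user_filter: Callable[[Batch], bool] = lambda batch: True,
) -> Optional[Batch]:
    """Pick the batch that would be opened first under the assumed ordering."""
    candidates = [
        b for b in batches if b.state in input_states and user_filter(b)
    ]
    if not candidates:
        return None
    # Higher priority (lower number) wins; the batch ID breaks ties.
    return min(candidates, key=lambda b: (b.priority, b.batch_id))


batches = [
    Batch("00000003", state=550, priority=5),
    Batch("00000001", state=800, priority=1),  # not in an input state
    Batch("00000002", state=550, priority=2),
]
print(first_invalid_batch(batches, input_states={550, 750}))
```

Here batch 00000002 is chosen: batch 00000001 has a higher priority but is not waiting in one of the configured input states.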
The first window displayed after starting Verifier is called the Batch List because it shows a list of batches. This is your work list.
To access the Batch List, select the Batch List option from the View menu, or click the Show Batch List toolbar button.
Note: If Verifier is not yet configured, the list of batches will appear empty.
The following shortcuts are available in the Batch List:
Keyboard Shortcut | Command
[Ctrl] + [1] | Batch List View.
[Ctrl] + [2] | Verification Mode.
[Ctrl] + [3] | Document Separation Mode.
[Ctrl] + [N] | Restore Focus.
[F5] | Refresh.
[Ctrl] + [E] | Release Exception Batches.
The toolbar provides quick access to some frequently used commands.
Button | Description
| Display a property sheet where you can configure Verifier.
| Display a dialog box where you can configure the batch filtering conditions properties.
| If you click on the arrow to the right of this button, the available filters for the list of batches are displayed. You can select one of the following options: All Batches; Batches to Verify, Classification Only; Batches to Verify, Indexing Only; Batches to Verify.
| Start the verification of the currently selected batch. Depending on the batch state, the batch is either displayed in the classification window or in the indexing window. From the list you can select one of the following options: Verify the selected or next batch; Verify the first invalid batch.
| Display the batch structure of the currently selected batch. Selecting a document shows the Document List, which provides an overview of the documents inside the batch.
| Start the Learn Set Manager application.
The navigation toolbar enables you to easily navigate through a large number of batches. You can also configure the number of batches that appear per batch page.
Button | Description
| Go to the first batch page.
| Go to the previous batch page.
| Go to the next batch page.
| Go to the last batch page.
| Refresh the batch list.
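The paging behavior of the navigation toolbar can be sketched in a few lines of Python. The function names are hypothetical, and the assumption that the Previous and Next buttons stop at the first and last page is made only for this illustration.

```python
import math


def page_count(total_batches: int, per_page: int) -> int:
    """Number of batch pages needed; an empty list still shows one page."""
    return max(1, math.ceil(total_batches / per_page))


def navigate(current: int, command: str, total_pages: int) -> int:
    """Return the page shown after a navigation command (pages are 1-based)."""
    targets = {
        "first": 1,
        "previous": current - 1,
        "next": current + 1,
        "last": total_pages,
    }
    # Assumption: navigation is clamped to the valid page range.
    return min(max(targets[command], 1), total_pages)


pages = page_count(total_batches=95, per_page=20)
print(pages, navigate(1, "previous", pages), navigate(2, "last", pages))
```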
For each batch, an icon indicates its status. When no icon is shown, the batch state is out of workflow. You can select another batch or change the settings for the workflow.
Symbol | Status
| Batch is finished and ready for export.
| Batch requires a correction of the classification results.
| Batch requires a correction of the extraction results.
| Batch is locked and unavailable, as it is in use by another application. Therefore it cannot be opened for correction.
| Batch contains documents with exception statuses. When it is unavailable, it needs to be released before you can work on it again.
The batch list can be sorted by each column. The table columns display the following information about the batch:
Batch ID | A number that can be used to uniquely identify the batch.
State | An integer between 0 and 999 that indicates the progress of batch processing. The state also indicates whether the batch is ready for verification.
Priority | An integer between 1 and 9 that indicates how urgent it is that a job be finished, where 1 is the highest priority and 9 is the lowest.
Name | An arbitrary name that is easier to read than the batch ID. Because the name is optional, it might be missing or set to a default value.
Folders | Documents in a batch can be grouped in structures called folders. The value in this column indicates the number of folders inside the batch.
Documents | The value in this column indicates the number of documents inside the batch.
Client | N/A. This column is not currently used.
Last User / Module | The computer name of the operator who most recently processed the batch and the name of the application that most recently processed the batch.
Last Access | Displays the date when the batch was last processed.
External Group ID | Optional. The group ID assigned to a batch for security purposes. Batches can be assigned to a user group via a unique ID. This column is not displayed by default.
External Batch ID | Optional. The name of the batch group. This can be used to represent any piece of information you would like to associate with a batch, for example, an external system ID or storage box ID. This column is not displayed by default.
Transaction ID | Optional. The transaction ID assigned to a batch. This allows the developer to synchronize a newly created batch of documents with another external system. It can be used to identify originators of a batch of documents. This column is not displayed by default.
Transaction Type | Optional. The transaction type assigned to a batch. This allows the developer to synchronize a newly created batch of documents with another external system. It can be used to identify the types of documents in batches or the source of the documents. This column is not displayed by default.
Note: The External Group ID, External Batch ID, Transaction ID, and Transaction Type columns are not displayed by default.
You can sort any column in Batch List. To sort any item, click on the title of the column.
Batches sort according to their position on the list. If you select the first batch, and then click the Batch column label, it moves to the bottom of the list. For other items, the values toggle between ascending and descending order, whether numeric or alphabetical.
To select a batch in the table of batches, click on it. You can then move through the list using the following keyboard commands:
• To move to the first document, press the [Home] key.
• To move to the next document, press the [Down] cursor key.
• To move to the previous document, press the [Up] cursor key.
• To move to the last document, press the [End] key.
• To move one page down, press the [Page Down] key.
• To move one page up, press the [Page Up] key.
To leave the Batch List and switch to another view, use one of the following keyboard commands:
• To verify the selected batch, press [Ctrl] + [2].
• To view the selected batch, press [Ctrl] + [3].
Document List displays the batch structure of the currently selected batch. Selecting a document provides an overview of the documents inside the batch. You can use Document List to investigate the documents in a selected batch.
To open Document List, click the Show Document List toolbar button.
The following keyboard shortcuts are available in Document List:
Keyboard Shortcut | Command
[Ctrl] + [P] | Print.
[Ctrl] + [Alt] + [Home] | First document.
[Ctrl] + [Alt] + [Page Down] | Next document.
[Ctrl] + [Alt] + [Page Up] | Previous document.
[Ctrl] + [Alt] + [End] | Last document.
[Ctrl] + [8] | Append document.
[Ctrl] + [9] | Cut document.
[Ctrl] + [Enter] | Accept/reject next unsure page.
[Ctrl] + [Space] | Select next unsure page.
[Ctrl] + [1] | Display the Batch List.
[Ctrl] + [2] | Start the verification of the selected batch.
[Ctrl] + [3] | Display the selected batch in Batch List.
[Ctrl] + [N] | Manually restore input focus without using the mouse.
[Ctrl] + [+] | Zoom in.
[Ctrl] + [-] | Zoom out.
[Ctrl] + [Left] | Move image to left.
[Ctrl] + [Right] | Move image to right.
[Ctrl] + [Up] | Move image upwards.
[Ctrl] + [Down] | Move image downwards.
[Ctrl] + [R] | Rotate the image.
[Ctrl] + [Home] | First page in document.
[Ctrl] + [Page Down] | Previous page in document.
[Ctrl] + [Page Up] | Next page in document.
[Ctrl] + [End] | Last page in document.
[Ctrl] + [M] | Show selection context menu.
[Ctrl] + [Z] | Undo.
[F7] | Reclassify manually.
[F3] | Show last verified document.
[F8] | Get last value for selected field.
[F9] | Move to exception state.
[Ctrl] + [E] | Release exception batches.
[Ctrl] + [L] | Apply local extraction.
[Ctrl] + [A] | Add document to learnset.
[Ctrl] + [T] | Correct tables.
[Ctrl] + [Q] | Switch table highlighting.
The toolbar provides quick access to some frequently used commands.
Button | Description
| Display a property sheet where you can configure Verifier.
| Display a property dialog box where you can configure the batch filtering conditions.
| Display the Batch List.
| Start the verification of the currently selected batch. Depending on the batch state, the batch is either displayed in the classification window or in the indexing window. A drop-down list allows users to verify the selected batch or verify the next invalid batch.
| Display the available filters for the batch structure. You can select from among the following options: All documents; Documents to Classify; Index Documents to Classify; Documents to Index.
| Start the Learn Set Manager application.
| Display the first page of the selected batch, or a single page of the selected document.
| Display the first two pages of the selected batch, horizontally, or the first two pages of the selected document.
| Display the first three pages of the selected batch, horizontally, or the first three pages of the selected document.
| Display the first two pages of the selected batch, vertically, or the first two pages of the selected document.
In the batch structure, Verifier displays a hierarchical representation of the batch contents.
The levels of this hierarchy are:
• Batch.
• Folder.
• Document.
For each document entry, Verifier provides the following information:
ID | A number that can be used to uniquely identify the batch, folder, or document.
State | An integer value between 0 and 999 that indicates the progress of batch processing. The batch state is calculated from the states of its folders. It corresponds to the lowest value of all folder states. The folder state is in turn calculated from the states of the documents. It corresponds to the lowest value of all document states.
Name | An arbitrary batch or folder name that is easier to read than the ID. Because the name is optional, it might be missing or set to a default value.
Document Class | A document's classification result. This entry might be missing if the document has not been classified.
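The way the batch state is derived from folder and document states can be shown with a small Python sketch. The function names and the state numbers are hypothetical, and the sketch assumes that every folder contains at least one document.

```python
from typing import Dict, List


def folder_state(document_states: List[int]) -> int:
    """A folder takes the lowest state of its documents."""
    return min(document_states)


def batch_state(folders: Dict[str, List[int]]) -> int:
    """A batch takes the lowest state of its folders."""
    return min(folder_state(states) for states in folders.values())


# Hypothetical batch with two folders; the values are document states.
folders = {
    "Folder 1": [800, 550],
    "Folder 2": [750, 800],
}
print(folder_state(folders["Folder 1"]))  # 550
print(batch_state(folders))               # 550
```

A single document that lags behind therefore keeps the whole batch at the lower state until that document has been processed.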
To navigate in the batch structure, choose from among the following keyboard commands:
• To move to the first document, press [Ctrl] + [Alt] + [Home].
• To move to the next document, press [Ctrl] + [Alt] + [Page Down].
• To move to the previous document, press [Ctrl] + [Alt] + [Page Up].
• To move to the last document, press [Ctrl] + [Alt] + [End].
• To expand or collapse a folder, double-click on it, or click the plus sign or minus sign next to it.
In the document list, you can split multipage documents into separate documents, with the exception of the first page of a document, which cannot be split. You can also merge consecutive pages of documents into one with multiple pages.
To split a multipage document, complete the following steps:
1. From the View menu, select Show Selected Batch, then All Documents.
2. Select Show document list.
3. Click a multipage document you want to split.
4. Right-click the second page, then select one of the following options:
• Select Cut Pages into a New Document to split the document into two documents.
• Select Cut pages into New Document Keeping Cover Page to split the document into several smaller documents and corresponding TIF files. Using this option includes the cover page of the original document as the cover page for the newly created documents.
Note: This option is not available until you have marked a page as a cover page. You can do this by right-clicking on the first page of the document and selecting Mark as Cover Page.
If you have changed the list sorting, such as switching the batch ID to descending order, the cut operation is not available.
5. Optional. If you have changed the list sorting, complete one of these options when the following prompt displays:
Would you like to switch back to the original sequence of the documents in batch?
• Click No to keep the sorting and disregard the cut operation.
• Click Yes to revert to the original sorting and cut the document.
To append a document to another, complete the following steps:
1. From the View menu, select Show Selected Batch, then All Documents.
2. Select and right-click the document to append to the previous document.
3. Select Append This Document to Previous One.
Note: If you have changed the list sorting, such as switching the batch ID to descending order, the append operation is not available.
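The effect of the cut and append operations on the page sequence can be modeled with a brief Python sketch. Documents are represented as plain lists of page names, and the function names are hypothetical. The sketch covers only the basic cut (without the cover page variant) and places the selected page at the start of the new document.

```python
from typing import List, Tuple

Document = List[str]


def cut_pages(document: Document, page_index: int) -> Tuple[Document, Document]:
    """Split a document so that the selected page starts the new document.

    page_index is 0-based; the first page cannot be used as a split point.
    """
    if not 0 < page_index < len(document):
        raise ValueError("cannot cut at the first page or outside the document")
    return document[:page_index], document[page_index:]


def append_to_previous(documents: List[Document], doc_index: int) -> List[Document]:
    """Merge the selected document into the one directly before it."""
    if not 0 < doc_index < len(documents):
        raise ValueError("there is no previous document to append to")
    merged = documents[doc_index - 1] + documents[doc_index]
    return documents[:doc_index - 1] + [merged] + documents[doc_index + 1:]


first, second = cut_pages(["p1", "p2", "p3"], page_index=1)
print(first, second)
print(append_to_previous([first, second], doc_index=1))
```

Appending the second document back to the first restores the original page sequence, which is why the two operations can be used to undo each other.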
The viewer toolbar allows you to adjust the magnification used to display documents using the following commands.
Button | Description
| Fits the document to window height.
| Fits the document to window width.
| Provides the best fit for an image.
| Zooms in.
| Zooms out.
The document area shows the first page of the document that has been selected in the batch structure.
It is possible to configure Verifier to display a specific page of each document by default instead of the first one. For more information, refer to the VerifierFormLoad event in the Oracle WebCenter Forms Recognition Scripting User's Guide.
To print a document, from the File menu, select the Print option.
Note: This function is available in all modes of Verifier with the exception of Batch Browsing Mode.
You can configure the amount of data on a printed form from the Print Setup dialog box. To access the Print Setup dialog box, select Page Setup... from the File menu.
The order in which the fields are printed is defined by the order of the fields configured in the project.
When a document prints, the header includes the document file name and the document class.
The following print setup options are available:
Print Image | When selected, also prints pages of the document file.
Print Form | Activates printing of the verification form.
Print Hidden Fields | When selected, Verifier prints not only the fields visible on the current verification form, but all the fields available in the loaded document.
Print Table Fields | When deselected, Verifier does not print table fields. Disabling this option might be useful for quick printing of documents with long tables.
Print column header on each printed table page | Enabled only if the Print Table Fields option is selected. When this option is selected, Verifier prints column headers on each page. This option is useful for printing long tables. This option is selected by default.
Always show the dialog box when printing | Displays the Print Setup dialog box when users press print.
Verification involves taking a document that has been processed, or partially processed, checking the processing results, and correcting any errors.
When you open verification view, the classification window displays automatically if the next document that is to be verified requires a correction of the classification result. Whether this is the case depends on the state of the document.
To display Verification View, complete the following steps:
1. Select a batch that requires verification from the list.
2. Click the Verify Selected Batch button. Alternatively, double-click the batch in the list.
The following keyboard shortcuts are available in the classification window:
Keyboard Shortcut | Command
[Ctrl] + [Alt] + [Home] | First document.
[Ctrl] + [Alt] + [Page Down] | Next document.
[Ctrl] + [Alt] + [Page Up] | Previous document.
[Ctrl] + [Alt] + [End] | Last document.
[Ctrl] + [1] | Display the Batch List.
[Ctrl] + [2] | Start the verification of the selected batch.
[Ctrl] + [3] | Display the selected batch in Batch List.
[Ctrl] + [N] | Manually restore input focus without using the mouse.
[Ctrl] + [+] | Zoom in.
[Ctrl] + [-] | Zoom out.
[Ctrl] + [Left] | Move image to left.
[Ctrl] + [Right] | Move image to right.
[Ctrl] + [Up] | Move image upwards.
[Ctrl] + [Down] | Move image downwards.
[Ctrl] + [R] | Rotate the image.
[Ctrl] + [Home] | First page in document.
[Ctrl] + [Page Down] | Previous page in document.
[Ctrl] + [Page Up] | Next page in document.
[Ctrl] + [End] | Last page in document.
[Ctrl] + [M] | Show selection context menu.
[F3] | Show last verified document.
[F8] | Get last value for selected field.
[Ctrl] + [L] | Apply local extraction.
[Ctrl] + [J] | Increase image area.
[Ctrl] + [K] | Decrease image area.
[F7] | Reclassify manually.
[F9] | Move to exception state.
[Ctrl] + [A] | Add document to learnset.
[Ctrl] + [E] | Release exception batches.
[Ctrl] + [T] | Correct tables.
[Ctrl] + [Q] | Switch table highlighting.
[Ctrl] + [Z] | Undo.
The toolbar provides quick access to some frequently used commands.
Button | Description
| Display the Batch List.
| Verify the selected batch.
| Display the Document List.
| The scope of this command depends on the Exception Mode set on the Options menu. Two options are available: Move Document to Exception State and Move Batch to Exception State. Clicking the arrow next to this button displays a list of exceptions. You can use these exceptions if you cannot correct a document at all, for example, because it belongs to none of the defined classes. Check with your administrator to determine which exceptions to use. Note that in order to avoid selection conflicts, only the toolbar button provides a list of exception handling states to choose from. The selection made here also applies if you move a document to exception state by selecting the appropriate option within the Options menu.
| Fit the current image to the height of the window.
| Fit the current image to the width of the window.
| Fit the current image to the width or height of the window for maximum enlargement.
| Zoom in.
| Zoom out.
| Display the first document in the batch.
| Display the previous document in the batch.
| Display the next document in the batch.
| Display the last document in the batch.
| Rotate the current document clockwise.
| Display the first document in the batch and switch the application to Browsing Mode.
| Display the previous document in the batch and switch the application to Browsing Mode.
| Display the next document in the batch and switch the application to Browsing Mode.
| Display the last document in the batch and switch the application to Browsing Mode.
| Display the first page of the document if the current document has more than one page.
| Display the previous page of the document if the current document has more than one page.
| Enter a page number in order to navigate directly to it. All invalid entries, for example, alphabetical characters and page numbers out of range, are ignored, and the page number is reset to the currently displayed page.
| Display the next page of the document if the current document has more than one page.
| Display the last page of the document if the current document has more than one page.
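The handling of the page number box described in the table above can be summarized in a short Python sketch. The function name is hypothetical; the sketch only mirrors the documented rule that invalid entries are ignored and the box falls back to the currently displayed page.

```python
def resolve_page_entry(entry: str, current_page: int, page_count: int) -> int:
    """Return the page to display after the user types into the page box."""
    try:
        page = int(entry.strip())
    except ValueError:
        # Non-numeric input, such as alphabetical characters, is ignored.
        return current_page
    # Page numbers are 1-based; out-of-range values are ignored as well.
    return page if 1 <= page <= page_count else current_page


print(resolve_page_entry("3", current_page=1, page_count=5))    # 3
print(resolve_page_entry("abc", current_page=2, page_count=5))  # 2
print(resolve_page_entry("9", current_page=2, page_count=5))    # 2
```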
This box shows the classification result of the current document. If you open the list, you see all available classes.
The list entries represent the classes assigned to the current project or user, and are controlled by the Verifier Classify script event.
If no result could be determined, the box shows as empty.
To set or change a classification result, make sure that you are not in browsing mode, and then perform one of the following actions:
• Click on the arrow on the right side of the list box to open the list, and then select a class.
• Use the cursor keys to browse through the list of classes. The entries in the list are sorted alphabetically.
• If you know the correct class name, you can type its first characters and wait until the system automatically displays the full class name.
To select a class result using advanced classification, complete the following steps:
1. Click the Classification Matrix button.
Note: This button is not available unless advanced classification has been enabled for the project by the administrator in Designer.
A list opens containing one or more classes if the result could not be determined with 100% certainty. If more than one class is in the list, the class entry is determined by probability. The class with the highest probability is at the top of the list.
2. Select a class for the current document and then click OK.
The indexing window displays fields and documents specific to your organization. The layout of the window can be customized by a project designer.
The indexing window automatically displays the next document that requires a correction of the extraction result.
To display verification view, complete the following steps:
1. Select a batch that requires verification from the list.
2. Click the Verify Selected Batch toolbar button.
To increase or decrease the image area, complete the following step:
Drag the vertical split bar between the image area and field area either to the right or left.
Note: This option is only available in the extraction verification view.
The following keyboard shortcuts are available in the indexing window:
Keyboard Shortcut | Command
[Ctrl] + [P] | Print.
[Ctrl] + [Alt] + [Home] | First document.
[Ctrl] + [Alt] + [Page Down] | Next document.
[Ctrl] + [Alt] + [Page Up] | Previous document.
[Ctrl] + [Alt] + [End] | Last document.
[Ctrl] + [1] | Display the Batch List.
[Ctrl] + [2] | Start the verification of the selected batch.
[Ctrl] + [3] | Display the selected batch in Batch List.
[Ctrl] + [N] | Manually restore input focus without using the mouse.
[Ctrl] + [L] | Apply local extraction.
[Ctrl] + [A] | Add document to learnset.
[Ctrl] + [J] | Increase image area.
[Ctrl] + [K] | Decrease image area.
[F9] | Move to exception state.
[Ctrl] + [E] | Release exception batches.
The toolbar provides quick access to some frequently used commands:
Button | Description
| Display the Batch List.
| Verify the selected batch.
| Display the Document List.
| Start the Learn Set Manager application.
| Click the down arrow next to the button for a list of exceptions. You can use these exceptions if you cannot correct a document at all, such as when the required data is illegible. Check with your supervisor to determine which exceptions to use.
| Mark all areas on the current document that have been used to fill the fields. If the result is valid, the area is highlighted in green. If the result is invalid, the area is highlighted in red.
| Mark only the area on the current document that was used to fill the field that is currently selected in the field area. If the extraction result is valid, the area is highlighted in green. If the extraction result is invalid, the area is highlighted in red.
| Mark the area that was used to fill the field that is currently selected in the field area. This area appears either in green or in red. In addition, all other areas that were taken into account to fill this field are highlighted in yellow.
| Fit the current image to the height of the window.
| Fit the current image to the width of the window.
| Fit the current image to the width or height of the window for maximum enlargement.
| Zoom in.
| Zoom out.
| If this button appears pressed down, the application always displays the document area that is associated with the currently selected field.
| Keep the established zoom settings on each document you view in the batch.
| Display the first page of the document if the current document has more than one page.
| Display the previous page of the document if the current document has more than one page.
| Enter a page number in order to navigate directly to it. All invalid entries, for example, alphabetical characters and page numbers out of range, are ignored, and the page number is reset to the currently displayed page.
| Display the next page of the document if the current document has more than one page.
| Display the last page of the document if the current document has more than one page.
| Rotate the current document clockwise.
| Display the first document in the batch and switch the application to Browsing Mode.
| Display the previous document in the batch and switch the application to Browsing Mode.
| Display the next document in the batch and switch the application to Browsing Mode.
| Display the last document in the batch and switch the application to Browsing Mode.
| Apply local classification and extraction to the current document.
| Add the current document to the local learnset.
| Start document learning.
| Start table correction.
Verifier supports the mouse wheel when validating documents in Document Verification mode. Rolling the mouse wheel has the following effect, depending on where the mouse cursor or the keyboard focus is:
Case | Wheel Rolling Effect
Input focus is in a multi-line header field. | Scrolls between lines of the header field.
Input focus is in a single line header field or at the first line / row of any field (scrolling up only) or at the last line / row (scrolling down only). | Scrolls the entire verification form.
Input focus is in a table field. | Scrolls between table rows or between multiple lines of the currently selected table cell (when multi-line).
Mouse pointer is in the Document Viewer area. | Scrolls the currently viewed page image up and down.
A form has three main elements: a label, a viewer, and a field.
Labels | Labels are captions that help users to identify form fields, as well as viewers and tables.
Viewer | A viewer contains snippets of document areas, normally those that were extracted to fill fields or tables.
Fields | A field displays data and allows for entering or editing of data. A field might be a text field, table field, check box, list box, or a Yes/No selection. You can use fields to create check boxes and combo boxes.
In the field area, the following icons are used to indicate the nature of the field:
Button | Description
| Indicates the currently selected field.
| Indicates a valid extracted field.
| Indicates a field that needs to be validated because it was extracted with low confidence.
The following list explains the field types:
1. A user cannot edit or select a Read Only field.
2. An Auto-completion field enables the user to edit text by typing the first few letters of a word until the best matching candidate appears.
3. A Multi-line field enables line wrap and displays a vertical scroll bar. This field type is a requirement for address analysis.
4. A List box contains a selection list to verify an item in a document.
5. Check boxes are toggle selections for data input that derive from form fields.
To navigate the field area, select one of the following options:
• Use the mouse. This method does not affect the validation state of a field.
• Press the [Tab] key. This method gets you to the next field, but not to the next document. This method does not affect the validation state of a field.
Note: The order that the [Tab] key moves through the form is part of the form's design.
• Press the [Shift] + [Tab] keys to go to the previous field.
• Press the [Enter] key. This method validates the entire field or the next invalid character within a field. Once the field is corrected, it is validated and the focus moves to the next field that requires correction. This field may also be within another document.
The document area shows the currently selected document or page along with highlights.
• Red areas indicate an invalid result.
• Green areas indicate a valid result.
• Yellow areas were considered as candidates, but another candidate seemed more likely. If the extraction result is invalid or wrong, these areas may point to the correct indexing data.
Note: In practice, red, green, and yellow areas never appear in the same document.
To navigate the document area, choose one of the following options:
• To highlight the entire document table, click the square in the upper-left corner of the table field.
• To highlight a document column, click the column label of a table field.
• To highlight a document row, click the row label of a table field.
• To highlight a document cell, click the cell of a table field.
Note: Valid areas are green and invalid areas are red. These areas may also contain validity icons, which are green check marks for valid fields or red Xs for invalid fields.
Only one table is displayed per verification form, even if you are able to define multiple tables. However, you can display different tables on different forms.
If you only need to verify certain columns in a table, you can make the other columns invisible. All invisible columns must be valid for the entire table to be valid.
For a large document with many line items, you can detect and view the location of all the extracted line items that are currently shown within a table field.
The current input area provides a large editing box and shows the following enlarged information for the currently selected field.
• A snippet that shows an enlargement of the document area that was used to fill the field.
• The extracted data. Color coding is used in the same way as in the field area. You can edit the data here.
The user information area is at the bottom of the Verifier window, and consists of three fields that display the following information:
• The name of the currently selected field.
• If the current field is invalid, the reason is displayed. If the current field is valid, the field is normally empty.
• The classification result of the current document.
If, during the automatic page separation process in Runtime Server, there was at least one unsure page-level decision for a batch of documents, the whole batch receives the state Failed Page Separation. Such a batch must be manually reviewed and, if required, corrected in Verifier.
You can correct the automatic page separation results in the Document Browsing mode of Verifier. When the next batch is opened, the system automatically displays the first unsure split or merged page.
The available options for automatic page separation are listed below:
Toggle the Unsure Status |
Select the Accept / Reject Next Unsure Page menu command or click [Ctrl] + [Enter]. This command sets the page to the manually accepted state or to the manually rejected state, respectively. There are three different states of page correction status: blue page icon for extracted with high confidence by the engine, blue page icon with a red question mark for extracted with low confidence by the engine (unsure) and blue page icon with green check sign for manually accepted / corrected by the Verifier user. These states are retained after the user closes the batch in Verifier and can be reviewed by other users. If all pages of a document become accepted (the pages extracted with high confidence are accepted by default), the document is redirected to successful page separation state. If at least one of the document�s pages becomes manually rejected, the entire document receives the lowest page separation failed state that is configured in Verifier settings. |
Split the Document into Two Separate Documents |
Select the Cut Document menu command or press [Ctrl] + [9]. The top document receives all the pages above the selected one, while the bottom document receives the selected page and all the pages below it. In this case, the page correction status is automatically applied to the selected page and the preceding page. If you split previously merged documents, the original document names are restored. |
Merge the Selected Document with the Previous One |
Select the Append Document menu command or press [Ctrl] + [8]. In this case, the first page of the selected document and the last page of the preceding one receive the manually accepted page correction status. |
Go to the Next Unsure Page |
Select the Next Unsure Page menu command or press [Ctrl] + [Space]. This action selects the next unsure page to verify (the one with the red question mark) without changing the state of any pages. |
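The page separation rules above can be illustrated with a minimal sketch. It is not Verifier code: the names `PageStatus`, `document_state`, `cut_document`, and `append_document` are hypothetical, and the "needs review" result for documents that still contain unsure pages is an assumption, because the guide only describes the accepted and rejected outcomes.

```python
from enum import Enum


class PageStatus(Enum):
    HIGH_CONFIDENCE = "high confidence"      # accepted by default
    UNSURE = "unsure"                        # low confidence, red question mark
    MANUALLY_ACCEPTED = "manually accepted"  # green check mark
    MANUALLY_REJECTED = "manually rejected"


def document_state(pages):
    """Derive a document-level result from the statuses of its pages."""
    if any(p is PageStatus.MANUALLY_REJECTED for p in pages):
        return "failed page separation"
    accepted = {PageStatus.HIGH_CONFIDENCE, PageStatus.MANUALLY_ACCEPTED}
    if all(p in accepted for p in pages):
        return "successful page separation"
    # Assumption: a document with unsure pages left is still awaiting review.
    return "needs review"


def cut_document(pages, selected):
    """Split at the selected page: the top part gets the pages above it,
    the bottom part gets the selected page and everything below."""
    return pages[:selected], pages[selected:]


def append_document(previous, selected_doc):
    """Merge the selected document into the previous one."""
    return previous + selected_doc
```

For example, `cut_document(["p1", "p2", "p3"], 1)` yields `["p1"]` as the top document and `["p2", "p3"]` as the bottom document.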
Manual correction of classification results is available if the Verifier workstation is configured with the following settings:
• Classification verification is enabled.
• Extraction verification is disabled.
To determine your settings, check the Workflow tab of the Verifier Properties dialog box.
If you do this task regularly, you may want to apply the appropriate filter in the Batch List. From the View menu, select the Batch Filter option, then select Batches to Verify, Classification Only.
To correct invalid classification results, complete the following steps:
1. In Batch List, check the state column to find a batch you can verify.
2. Open the selected batch in the Verification View.
3. The Verification View opens in Verify Mode, with the first invalid document being displayed. The cursor is already placed in the classification list box.
4. To select a class, do one of the following:
• Click on the arrow on the right side of the list box to open the list and then select a class.
• Use the arrow keys to browse the list of classes and make your selection. The entries in the list are sorted alphabetically.
• If you know the correct class name, type its first characters and wait until the system automatically displays the full class name.
5. To confirm your selection, press the [Enter] key. The application validates the document and its state increases. The next document requiring verification is displayed automatically.
When all documents in the batch are validated, the application prompts you to release the batch. Click the Yes, No, or Details button, as appropriate. Clicking on Details reveals more options:
• Verify Next Invalid Batch on the List releases the current batch and opens the next batch that needs verification.
• Close Batch and Return to the Batch List releases the current batch and displays the Batch List, where you can select the next batch.
• Verify This Batch with the Next Verification Form changes verification forms and displays the next verification form.
Deleting a class makes the class obsolete. With the Supervised Learning workflow, classes often become obsolete because the global project's configuration deletes the class or simply does not insert it.
An obsolete document class can be processed only if the class still exists in the project that is used to process the document. Information about the former parent class assignment is saved internally, which makes it possible to process obsolete document classes.
Manual correction of extraction results is available if the Verifier workstation is configured with the following settings:
• Classification verification is disabled.
• Extraction verification is enabled.
To determine your settings, check the Workflow tab of the Web Verifier Settings or the Verifier Properties dialog box.
If you do this task regularly, you may want to apply the appropriate filter in the Batch List. From the View menu, select the Batch Filter option, then select Batches to Verify, Extraction Only.
To correct invalid results, complete the following steps:
1. In Batch List, check the state column to find a batch you can verify.
2. Open the batch in the Verification View.
The Verification View opens in Verify Mode and the first invalid document displays. The application places the cursor in the first invalid field, and the user information area contains a message indicating why the field is invalid.
A form can include the following elements:
Form Fields |
Display extracted data. You can also enter and edit data during manual indexing. You can use form fields to create checkboxes and combo boxes. |
Labels |
Identify form fields, viewers, and tables. |
Viewers |
Are sections of document areas, normally those that were extracted to fill fields or tables. |
Buttons |
Fire actions for a new script event. |
Tables |
Extracted from documents. |
The following is a list of field types and their descriptions:
Read Only |
When selected, information in a field is dimmed and cannot be selected or edited. |
Auto-Completion |
Enables you to edit text in a field by typing the first two letters of a word. Auto-completion finishes the word with the best matching candidate. |
Multi-line Fields |
Required in the context of address analysis but can also be useful in other cases. A multi-line field enables line wrap and displays a vertical scroll bar, if required. |
List Box |
A dropdown box that lists predefined strings related to the verification document. It can either show the nearest values automatically or show only selected values. |
Checkbox |
A toggle selection for one of two choices of the data input for a field, for example, yes/no. |
Verifier and Web Verifier include automated features for editing text fields that can speed up text entry and correction.
You can use automatic character entry to edit text fields and cells when auto-completion is enabled in the form field Properties dialog box. Other options for character changes include multi-line fields, combo boxes, and checkboxes. You can also insert and replace text in table cells and fields, either in single words or blocks of text, using drag-and-drop or by double-clicking on the selected text.
Multi-line fields are necessary for address analysis but can also be useful in other cases. A multi-line field enables line wrap and displays a vertical scroll bar, if required. To add a new line to a multi-line field, press [Ctrl] + [Enter].
A combo box lists predefined strings related to the verification document. To aid in verification, you can select from the list of strings.
The checkbox provides a binary option that toggles table data entry choices on and off. For example, with a Yes or No checkbox, selecting the checkbox (Yes) displays the data entry options related to the verification, and clearing it (No) hides them.
Auto-completion helps to speed up typing. When you start to type, auto-completion completes the word, suggesting the best match among all of the words or candidates available after OCR and format analysis.
For example, you can type the first two characters of a 20-character invoice number. The auto-completion feature finds the best matched candidate suggested by the format analysis engine and places it in that field. The auto-completion feature for a header field automatically selects the best candidate from the available ones if Verifier is in Highlight Candidates mode. However, the viewer is updated only if the candidate appears once in the document; otherwise the viewer is blank when auto-completion completes the word for the field.
The automatically selected text also appears highlighted in the original document. Select whether a single-line or a multi-line text field should be displayed. To override auto-completion, continue typing the desired text.
Note: Auto-completion does not work on formatted text or on characters incorrectly read by OCR.
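The auto-completion behavior described above can be sketched as a prefix match over the available candidates. This is only an illustration, not the product's implementation: the function name `autocomplete`, the `(text, confidence)` pair layout, and ranking the matches by a confidence score are all assumptions, because the guide does not say how the best match is chosen.

```python
from collections import Counter


def autocomplete(prefix, candidates, min_chars=2):
    """Return (best_candidate, update_viewer) for a typed prefix.

    candidates: list of (text, confidence) pairs, one per occurrence in
    the document. The pair layout and the confidence ranking are
    assumptions of this sketch.
    """
    if len(prefix) < min_chars:
        return None, False
    matches = [(text, conf) for text, conf in candidates
               if text.startswith(prefix)]
    if not matches:
        return None, False
    best_text, _ = max(matches, key=lambda pair: pair[1])
    occurrences = Counter(text for text, _ in candidates)[best_text]
    # The viewer is updated only if the candidate appears once in the document.
    return best_text, occurrences == 1
```

A candidate that occurs twice in the document is still placed in the field, but the second element of the result is `False`, which corresponds to the viewer staying blank.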
To speed up verification, you can insert words to replace or append text.
The method for inserting words depends on the availability of candidates. A candidate is a word that matches the learned words for that field. When you select a candidate after selecting the field, it appears in green, with a border of green check marks if that visual indicator is enabled in Batch Options. Non-candidates display in orange when selected. You can insert words in fields or table cells, and you can use the mouse to append to or replace the existing text.
To use the Append feature, the selected word must appear on the same line as the existing cell text. If not, the selected word will replace the existing cell text.
Words that are candidates for cells: If the word belongs in a cell area, you can append or replace a word in a cell. The Append feature takes the current word behind the candidate and appends the cell text. It places the text in the best location, either to the right or left of the word, and in the cell location based on the text or location of the word. The word belonging to a cell area displays in green when selected. Or, you can replace the text. In the search region, word candidates are all words that are not covered (by location) by other table cells and that have the same beginnings as the whole text of the cell.
Words that are not candidates for a cell: If the word does not belong to cell areas, it displays in orange when selected. Even if it is not a candidate, you can append or replace the word. Appending places the text in the best location, either to the right or left of the word, based on the text or location of the word. Alternatively, you can replace the cell text and location with the text and location of the word. For example, a cell named C2658 might be appended by "number".
If a word is a candidate for a field, you can append or replace the word in a field box. A candidate is a word that matches the learned selections for the field. To copy text to the field box, complete the following actions:
• Click on the text you want to copy. A box appears around the word.
• Double-click the box or right-click in the document and select Copy to Current Field.
Note: You can insert only one candidate per field per document verification session.
Even if the word does not belong to any candidates for the field, you can use it to append to or replace the field text. Appending places the text in the best location, either to the right or left of the existing word, based on the text or location of the word. Alternatively, you can replace the field text and location with the text and location of the word. A word that does not belong to any candidates for that field displays in orange when selected. To use a word that is not a candidate for the field, complete one of the following options:
• To append the new text to the existing text, drag a box around the desired word. Double-click on the word in the box or right-click in the document and select Append Field Text by Word.
• To replace text, select the word with the mouse. A box appears around the word. Double-click it, or select Replace Field Text by Candidate in the shortcut menu.
Note: You can insert only one candidate per field per document verification session.
Make sure that this word fits the format analysis rules defined for that field. If it does not, the word is highlighted in orange (and with a border of orange exclamation marks if validity icons are enabled) to help distinguish it, and it would not be a good candidate for the field.
Inserting large blocks of text with minimal mouse movement is helpful when you have multiple word data verification elements for fields such as address information or cell descriptions. Before you can insert blocks of text, first select the settings in the Workflow dialog box to immediately copy information. To insert large blocks of text, complete the following steps:
1. In the image viewer, drag over the text.
2. Optional. Use the handles to adjust the selection.
3. Drag the selection to the field or table cell.
4. Optional. Double-click on the selection. The selected text replaces the text in the field or table cell.
1. After you correct a field, press the [Enter] key to validate it.
During validation, the field's background color turns yellow, and the cursor becomes an hourglass. Once the validation is finished, the cursor moves automatically to the next invalid field, regardless of whether this field is in the same invalid document or in the next invalid document. If you leave a document this way, it is validated automatically. In the next field, proceed as described above. When all documents in the batch are validated, the application prompts you to select what to do next.
2. Click Yes or No, or click Details. Clicking Details reveals the following options:
• Verify Next Invalid Batch in the List releases the current batch and opens the next batch that needs verification.
• Close Batch and Return to the Batch List releases the current batch and displays the Batch List, where you can select the next batch.
• Verify This Batch with the Next Verification Form changes verification forms and displays the next verification form.
Simultaneous correction of classification and extraction results is available if your workstation is configured with the following settings:
• Classification verification is enabled.
• Extraction verification is enabled.
• Automatic extraction after classification is disabled.
Note: If you do this task regularly, you may want to apply the appropriate filter in the Batch List by selecting Batch Filter and then Batches to Verify from the View menu.
Organizations usually collect information about themselves and everybody they do business with. Much of this information is stored in databases. Databases can be an excellent support for indexing because they store related information that can easily be retrieved. During indexing, if you have extracted one piece of information from a document, you can obtain related pieces from the database and fill the associated fields automatically. This method is called smart indexing.
Normally, smart indexing is combined with manual indexing. Some fields of a form have to be filled in manually; some fields can be filled automatically.
For example, assume that your organization saves information related to orders in the database of its ERP system. Every order is characterized with a unique identifier and some attributes about the supplier and the items that have been ordered. Soon after an order is placed, the ordered items are delivered, and a delivery note is attached. The corresponding invoice follows soon. The delivery note and the invoice refer to the original order. They have the order�s unique identifier printed on them. With this identifier, you can look up supplier information from the database when you verify the delivery note and invoice. However, new information such as the invoice date has not yet been entered into the database. This information can be supplied manually.
To use smart indexing, complete the following steps:
1. Smart index fields can be recognized by the key icon that is displayed next to them. Select a smart index field. The field itself and all the fields that can be filled through the database lookup are marked with a yellow database icon.
2. If the field is still empty, enter the field value. Alternatively, enter a wildcard expression, using an asterisk to represent a sequence of characters or a question mark to represent a single character.
3. Complete one of the following options to start the lookup:
• If your application is configured accordingly and the field content is correct, validate the smart index field by pressing the [Enter] key.
• Press [Alt] + [F12].
4. The system may respond with one of the following options:
• If the lookup yields no results, a corresponding message is displayed. Fill the lookup fields manually. If you cannot complete the fields, send the document to exception handling.
• If the lookup yields one result, the lookup fields are filled.
• If the lookup yields multiple results, and this is allowed in your application, the lookup fields are filled.
• If the lookup yields multiple results, and this is not allowed in your application, a dialog box is displayed where you can select the correct record. The lookup fields are then filled accordingly.
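The wildcard lookup and the four possible responses in the steps above can be sketched as follows. This is a hypothetical model rather than the Verifier lookup itself: the functions `wildcard_to_regex`, `lookup`, and `lookup_outcome`, the record layout, and the outcome strings are all assumptions made for illustration.

```python
import re


def wildcard_to_regex(expression):
    """Translate '*' (any sequence) and '?' (one character) into a regex."""
    parts = []
    for ch in expression:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def lookup(expression, records, key):
    """Return all records whose key value matches the whole expression."""
    pattern = wildcard_to_regex(expression)
    return [r for r in records if pattern.fullmatch(r[key])]


def lookup_outcome(results, allow_multiple):
    """Map the number of results to the response described in step 4."""
    if not results:
        return "show message; fill fields manually"
    if len(results) == 1 or allow_multiple:
        return "fill lookup fields"
    return "show selection dialog"
```

With records keyed by a hypothetical `order_id` column, the expression `A10?` matches `A100` and `A101` but not `A1000`, while `A*` matches all three.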
To browse through all documents in a batch, complete the following steps:
1. In Batch List, use the status value to determine a batch you can browse through.
2. Open a batch in Verification View. The first document that requires correction is automatically displayed.
3. To display the first document in the batch, press [Ctrl] + [Alt] + [Home] or use the appropriate toolbar button.
4. You may encounter a document that has been classified incorrectly. To correct this result, press the [F7] key to open the classification window. To correct the class, select the corresponding entry from the list box at the bottom, then confirm by pressing the [Enter] key.
5. This displays the indexing window again.
6. To correct extraction results, type your corrections into the corresponding field. If a field has been changed, its state is set to invalid. Press the [Enter] key to validate the field you modified, and then press [F3] to return to the document.
7. To get to the next document, press [Ctrl] + [Alt] + [Page Down] or use the appropriate toolbar button.
8. Repeat the above steps as appropriate until you reach the last document.
You can correct invalid cells the same way you would
correct an invalid text field.
Additional methods that simplify manual table extraction, such as the Correct Table function, are available.
If automatic table extraction fails to recognize the line items properly, Verifier provides several convenient ways to correct the table manually.
Auto-completion works in table cells and with text fields. When you type two or more characters, auto-complete suggests a word or phrase for that cell.
The candidate appears in green if the field is valid and red if the field is invalid. If the visual validity icons are enabled in Batch Options, valid fields also have a border of green check marks and invalid fields have a border of red question marks. This function only works with Highlight Candidates mode.
You can insert single words into table cells or append them to existing text. To insert words in table cells, complete the following steps:
1. To select the text you want to insert in a table cell, complete one of the following actions:
• Double-click on the word.
• Right-click on the word in the image viewer and select the respective option.
• If you have candidates, double-click the desired candidate to replace it.
2. To append the new text to the existing text, select Append Cell Text by Word.
3. To replace the existing text with the new text, select Replace Cell Text by Word.
Even if the word does not belong to any candidates for the cell, you can insert single words, or append to or replace existing text in table cells. To insert words in table cells, complete the following steps:
1. To append the new text to the existing text, double-click the word or right-click in the image viewer.
2. Select Align & Copy to Current Field.
3. To replace text, select the word, and then select Copy to Current Field in the shortcut menu.
You may need to correct the table structure. Table rows, cells, and columns have shortcut menus with options for modifying the table structure. To invoke them, right-click on the row, cell, or column label.
The available commands are summarized in the table below:
Shortcut Menu | Command | Description |
Column | Unmap | Clears all data for the selected verification column and turns the state of the corresponding column of the recognized table back to unmapped. To view an unmapped column, double-click on the table header in the verification form. All unmapped columns are highlighted in red. |
Column | Map | Adds the column selected from the shortcut menu, or you can right-click on an unmapped column to map it to a column in the verification form. |
Column | Swap | Exchanges the position of the current column and the one selected from the dropdown menu. |
Row | Insert | Inserts an empty row above the current one. |
Row | Delete | Deletes the current row. |
Row | Duplicate | Duplicates the current row. |
Row | Append | Appends an empty row at the bottom of the table. |
Row | Merge | Merges cells in a row. |
Cell | Insert Cell | Inserts an empty cell above the selected cell while shifting the cells below down. |
Cell | Delete Cell | Deletes the selected cell. The cells below are shifted up. |
The rubber-banding feature allows you to select a block of data on a document and place this block of values at a particular point within the table.
Note that the Table Correction mode must be switched off for the menus described in the following sections, though a mixed usage of manual and engine driven table correction is possible.
To auto-scroll when using the rubber-banding feature, use one of the following options:
• If the target document area is displayed only partially, for example because of the zoom level, move the mouse outside the document area while rubber-banding to scroll the document and select all of the desired data.
• If you want to resize a rubber-banded area, drag the corner of the rubber-band rectangle. The window auto-scrolls if you want to select values outside of the visible area.
There are two ways to add column data to a table:
• Insert Column Table Data: If whole rows are missing after extraction, use this option for better accuracy. The application observes the relationship between the columns and maps the values appropriately. Already extracted cell values are shifted up or down by the insertion, which creates new rows.
• Replace Column Table Data: This option is convenient when at least one column has been extracted completely and other table columns contain random values. With this option, the column data can be added in blocks, overwriting the previously extracted entries.
To insert column data, perform one of the following actions:
• To insert column data above a correctly extracted and filled cell, click into the filled cell and select Insert Column Table Data from the menu. This creates additional rows and shifts already existing rows up. At the same time, the values will be automatically assigned to already extracted values of other columns if available.
• To insert column data below a correctly extracted and filled cell, click into the next empty cell below and select one of the two options depending on whether or not you want to keep already extracted cell entries.
To replace column data, click into the desired starting point cell and select the Replace Column Table Data option.
Provided that the rubber-banded area spans more lines than already contained in the table, the additional subsequent lines will be added as additional rows continuing the table.
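The replace behavior described above, including the extra rows that are added when the selection is longer than the table, can be modeled with a short sketch. It is an illustration under stated assumptions: the table is represented as a list of dictionaries, and `replace_column_data` is a hypothetical helper, not a Verifier function. The insert variant is not modeled because its row-shifting rules depend on the relationships between columns.

```python
def replace_column_data(table, column, start_row, values):
    """Overwrite one column from start_row downward with the selected values.

    table: list of dict rows (an assumed representation). Rows are
    appended when the selection has more lines than the table has rows.
    """
    columns = list(table[0].keys()) if table else [column]
    for offset, value in enumerate(values):
        row_index = start_row + offset
        while row_index >= len(table):
            # Continue the table with an empty row for each extra line.
            table.append({name: "" for name in columns})
        table[row_index][column] = value
    return table
```

Replacing two values starting at the last row of a two-row table overwrites that row and appends one new row, whose other columns stay empty.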
To use the rubber-banding feature, complete the following steps:
1. Place the cursor into the destination cell within the table.
2. Draw a rubber-band rectangle around the column data on the image within the Document Viewer.
3. Right-click the selection on the image and click either Insert Column Table Data or Replace Column Table Data in the popup menu, depending on your task.
One use case for the rubber-banding feature is when columns of one data type are split up and placed side-by-side, or stacked, on a page. To correct such a table, complete the following steps:
1. Place the cursor into the first line of the column.
2. Draw a rubber-band rectangle around the first block of column data on the document.
3. Right-click the selection and click Insert Column Table Data.
4. To append further column data, right-click the row's node of the last table row and select Append Row.
5. Place the cursor into the next empty cell of the desired table column.
6. Draw a rubber-band rectangle around the next data block of the same column type and select Insert Column Table Data.
7. Proceed the same way with other columns to fill the table.
Another use case for the rubber-banding feature is when column items have been extracted only partially. To correct such a table, complete the following steps:
1. Place the cursor into the already extracted cell to mark the starting point for the insertion.
2. Draw a rubber-band rectangle around the column data on the document.
3. Right-click the selection and select Insert Column Table Data from the popup menu.
4. Continue mapping the other column data.
Another use case for the rubber-banding feature is when column items are missing from the neighboring column. The following example shows how to insert missing values:
1. Place the cursor into the Reference cell.
2. Perform the steps as described in Use Case Two.
Another use case for the rubber-banding feature is when you have documents where the data columns appear misaligned. To correct this situation, map the column data in blocks, as described for previous use cases.
You may need to correct the table structure if, for instance, an unnecessary cell has been mapped to the table or a missing cell has to be added.
You may have documents where one of the line items is missing. During extraction, the values from below might be shifted up to fill the empty space.
To handle this, you can add or remove single cells.
To delete a cell, complete the following steps:
1. Click the cell within the table to place the cursor in it.
2. Right-click and select Delete Cell (Shift Cells Up) from the popup menu.
To insert a cell, complete the following steps:
1. Click the cell within the table that is subsequent to the cell candidate and place the cursor in it.
2. Right-click and select Insert Cell (Shift Cells Down) from the popup menu.
This creates an empty cell within the table above the selected cell. Now, you can copy the desired value from the document into the newly created cell.
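The effect of the two cell commands on a single column can be sketched as list operations. The helpers `insert_cell` and `delete_cell` are hypothetical names used only for illustration, and a column is assumed to be a plain list of cell values.

```python
def insert_cell(column, index, value=""):
    """Insert an empty cell above position index; cells below shift down."""
    return column[:index] + [value] + column[index:]


def delete_cell(column, index):
    """Delete the cell at index; cells below shift up."""
    return column[:index] + column[index + 1:]
```

If the second line item had no value and the values from below were shifted up during extraction, `insert_cell(["10", "30", "40"], 1)` restores the alignment by producing `["10", "", "30", "40"]`, after which the missing value can be copied into the empty cell.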
The learning process for the Brainware Table Extraction engine consists of two phases:
• Learning lines
• Learning mappings of columns
These are discussed in detail in the following sections.
Note that this functionality is available for the Supervised Learning Verifiers. With the Generic Table Extraction, no extra learning is needed.
The Brainware Table Extraction engine distinguishes the following main types of lines:
Primary Line |
A line that defines table structure. The engine applies advanced and precise similarity analysis for all primary lines. It is important that all primary lines are well-structured and that they look similar in many of the rows to extract. The engine easily supports an unlimited number of types of primary lines for one table definition. The primary line must contain at least four words. Otherwise, the engine will not learn it. In addition, the primary line must be the first line in the table row. |
Secondary Line |
A line between primary lines. The engine applies smooth similarity analysis for these types of lines, which is possible because Brainware Table Extraction only searches the area between two neighboring primary lines. This allows the engine to extract data that varies widely, which often happens with multi-line descriptions. There is also no limit to the number of words in secondary lines, and no limit to the number of secondary lines. However, a document's page must have at least one primary line; otherwise, secondary lines on this page are not extracted. |
Wrong Line |
A primary line that is learned as a negative line sample. In other words, all lines classified by the engine as members of one particular wrong line class are not extracted. In principle, it is possible to learn an unlimited number of wrong lines, though the current restriction is that this will only take effect during in-document learning. Cross-document learning (that is, learning the whole document after all the fields are completely valid) may not automatically train the wrong lines. |
After it learns any type of line, the Brainware Table Extraction engine automatically creates and manages
a new line class (cluster). Afterward, all lines in the document considered by
the engine to be members of the line class (similar to the learned line sample)
will be extracted, or not extracted in the case of wrong lines.
It is possible to
learn an unlimited number of different line classes. However, the overall
quality may suffer if too many lines are learned.
Learning lines can
be applied in lines learning (or lines highlighting) mode. Mapping of the
column data in the lines can be done in column mapping learning (or columns
highlighting) mode. The user can switch between learning (highlighting) modes
with the Switch Table Highlighting
menu option in the Options menu, or with the context menu options Show Lines and Show Columns.
When learning the mapping for columns, the user trains the engine on how the data from the extracted lines must be mapped to the user's table data.
For primary lines, this mapping can be defined differently for different line classes. For example, if a user learned two different line samples that went to two different line classes internally in one document, the user can then map Unit Price in the document to the Unit Price data column, and the Total Price to the Total Price data column for the first line sample. For all lines of the second line type, the user can map Unit Price to Total Price, and Total Price to Unit Price. For the next document, the Brainware Table Extraction engine will always use the first set of mapping rules for the lines classified to the first line type, and the second set of mapping rules for the lines classified as the second line type.
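The per-line-class mapping in this example can be pictured as one lookup table per line class. The sketch below is only an illustration of the idea: the names `MAPPINGS`, `map_line`, `line_type_1`, and `line_type_2` are hypothetical, and the engine's internal representation is not documented in this guide.

```python
# One set of mapping rules per learned line class (hypothetical names).
MAPPINGS = {
    "line_type_1": {"Unit Price": "Unit Price", "Total Price": "Total Price"},
    "line_type_2": {"Unit Price": "Total Price", "Total Price": "Unit Price"},
}


def map_line(line_class, extracted, mappings=MAPPINGS):
    """Apply the mapping rules of a line class to one extracted line.

    extracted: dict of source column name -> value found in the document.
    Returns a dict of data column name -> value.
    """
    rules = mappings[line_class]
    return {rules[src]: value for src, value in extracted.items() if src in rules}
```

A line classified as the second line type therefore has its two price values swapped on the way into the table, while a line of the first type keeps them as they are.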
If you have several Brainware Table Extraction tables in one class, the learnset is shared between these tables. In other words, if you used interactive learning for one Brainware Table Extraction table, cross-document learning (which happens if the system added the document to the learnset after document validation) is applied for all Brainware Table Extraction tables in the document.
Any time you train a table interactively, complete the required training first and then manually verify the table.
Brainware Table Extraction can train line types and column mapping for each type of line.
When working with interactive table extraction, learn the lines before you map the columns.
Because of the way interactive table verification works, you cannot manually delete data from a cell. Rather, if you want to discard cell data, un-map the column and re-extract the table to remap the column. Although it will seem as if you deleted the data, the data is still there until you un-map the column.
This section describes the simplest way to use interactive Brainware Table Extraction learning. If this method does not work, proceed to the advanced method described in the following sections. To use the standard method, complete the following procedures:
2. Learn mapping in the row you learned.
4. Learn and adjust the mapping of missing or wrong columns.
5. Manually correct the table data and validate the table.
1. Select your Brainware Table Extraction table by clicking any table field inside the table grid.
2. Click the Correct Tables button.
3. In the lines highlighting mode, use the Learn as Row function to show the row sample. This function automatically learns the first line as a primary line and the rest of the lines as secondary lines. This function is also available by double-clicking on the selected row area. Select the whole first row and learn it.
Note: The visual indicators for valid, invalid and questionable table lines are as follows: valid lines are highlighted green, invalid lines are highlighted gray, and questionable lines are highlighted blue.
4. Optional.
To learn a new line as a primary line, complete the following actions:
� Right-click
on any line marked in gray in the Image Viewer.
� On
the popup menu, select Learn Line.
� The learned lines change from gray to green,
or to blue if the line is extracted with low confidence.
5. Optional. To learn a block of lines as primary lines, complete the following actions:
• In the Document Viewer, draw a rectangular selection over the primary lines.
• Right-click on the selection.
• On the shortcut menu, select Learn as Primary Line(s).
• All correctly selected primary lines will be learned and highlighted in green (or blue if the line is selected with low confidence), and all other lines will be similarly extracted and displayed.
Note: If some lines were not extracted, try relearning the lines singly or in a block.
6. Optional. To learn a block of lines as a table row, complete the following actions:
• In the Document Viewer, draw a rectangular selection over the required multi-line (or single-line) table row.
• Double-click or right-click on the selection.
• From the popup menu, select Learn as Row.
If some lines were not extracted, repeat the procedure described above.
Do not try to learn the remaining missing secondary or primary lines now, because mapping is defined on the basis of line type. If you trained all the different line samples now, you would need to learn the column mapping separately for every line class. To reduce the time needed to train the table, first learn the column mapping for the row you just learned. If you then learn another line sample, the engine automatically applies the existing mapping rules to the newly learned row.
Green highlighting indicates that a line is extracted with high confidence; blue highlighting indicates low confidence. Moving the mouse cursor over a highlighted line shows its confidence value as a tooltip. If the confidence for a blue line is less than 0.3, the line will not be extracted.
Blue highlighting also has another important meaning: the engine can train this line as a new line class.
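The relationship between confidence and highlighting can be sketched as follows. This is only an illustration, not the engine's implementation: the 0.3 cutoff comes from the text above, while the function name and the 0.8 boundary between "high" and "low" confidence are assumptions, because this guide does not state where high confidence begins.

```python
from typing import Optional

# Stated in the guide: lines with a confidence below 0.3 are not extracted.
MIN_EXTRACTION_CONFIDENCE = 0.3
# Hypothetical value: the guide does not say where "high confidence" begins.
ASSUMED_HIGH_CONFIDENCE = 0.8


def highlight_color(confidence: float,
                    high: float = ASSUMED_HIGH_CONFIDENCE) -> Optional[str]:
    """Return the highlight color for a line, or None if it is not extracted."""
    if confidence < MIN_EXTRACTION_CONFIDENCE:
        return None  # too uncertain: the line is not extracted
    return "green" if confidence >= high else "blue"


for value in (0.92, 0.45, 0.12):
    print(value, highlight_color(value))
```

In this sketch, a blue result marks a line that is extracted but is worth reviewing or learning as a new line class.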
All correctly selected primary lines will be learned, and all other lines are similarly extracted and displayed.
1. Switch to the columns highlighting mode now (using [Ctrl] + [Q]) and mark the location of your first cell item in the row you learned.
2. The system displays a special mapping control asking for the data column to extract the data to.
3. Select the required data column by double-clicking on it.
4. Repeat this step for the rest of the cell items in the first row.
1. Switch back to the lines highlighting mode.
2. Mark the next missing row and learn it as described before.
3. Repeat steps 1 and 2 for all rows on all pages where something is missing. Go to the next step only after you are sure nothing is missing.
1. Return to columns mapping learning mode and look for wrong or missing mapping. Correct any missing mapping.
2. If you can't map the missing columns, switch back to the lines highlighting mode and try to learn the row where the mapping is missing.
3. Switch to columns highlighting. If the mapping is still missing, mark the missing part and map it.
Note: The Brainware Table Extraction engine may determine the mapping automatically.
4. Repeat steps 1 - 3 until the data is completely extracted or cannot be learned correctly.
Note: There is always a chance that you will not get 100 percent extraction results.
Switch to cells highlighting mode and manually correct missing data, OCR errors, and so on.
Optional. Click the Add Current Document to Local Learnset button to add the document to the Learnset and then learn it, which is known as cross-document learning. Do this if the system did not suggest learning the document automatically.
Note: The only requirement for cross-document learning is correctness and completeness of the table data to train. This means that location, content, and format of every cell item is accurate.
After table learning and validation are complete, and the rest of the document's fields are validated, you may want to add this document to the learnset and then learn it. This is cross-document learning, in contrast with in-document interactive Brainware Table Extraction learning.
If the system did not suggest learning the document automatically, but you still would like to learn your table, activate learning by clicking the Add Current Document to Local Learnset toolbar button.
Note: The only requirement for cross-document learning is correctness and completeness of the table data to train. This means that location and content of every cell item should be correct. Also, ideally, the content of cell items should not be formatted.
This section discusses the special cases in which it is necessary to use secondary lines explicitly. There are two such cases:
• The table row begins on one page and ends on the next.
If a table row begins on one page and ends on the next page, you must use the Learn as Secondary Lines function (in lines learning mode) to train missing secondary lines (on the next page). In this case, these secondary lines are placed right before the first primary line on the page. Mark all the secondary lines as before: right-click and select Learn as Secondary Lines.
Never use the Learn as Row function in this case, as this tells the engine that the first secondary line is actually a new sample of a primary line. As a result, the engine may split extracted table data into new rows.
• Learning of unmapped secondary lines leads to unwanted extraction.
Your project may require that data from secondary lines not be extracted. Usually, this will not be a problem, but sometimes the engine extracts the data from these lines anyway. In this case, not learning these secondary lines will prevent unwanted extractions. Use the Learn as Secondary Lines function instead of Learn as Row if you want to learn only selected lines and not all lines that belong to the row. You can also use the Unlearn Line function to correct or adjust the extraction.
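The first case above, where a row continues on the next page, can be illustrated with a simplified model of how lines are grouped into rows. This is a sketch under stated assumptions, not the engine's actual algorithm: it assumes each primary line starts a new row and each secondary line attaches to the row before it. The function name and the line-type labels are hypothetical.

```python
from typing import List


def group_lines_into_rows(line_types: List[str]) -> List[List[int]]:
    """Group line indexes into table rows.

    Simplified model: a primary line opens a new row, and a secondary
    line is appended to the most recently opened row.
    """
    rows: List[List[int]] = []
    for index, line_type in enumerate(line_types):
        if line_type == "primary" or not rows:
            rows.append([index])
        else:
            rows[-1].append(index)
    return rows


# Line 2 is the continuation of the first row on the next page.
learned_as_secondary = ["primary", "secondary", "secondary", "primary"]
# Using Learn as Row on line 2 would make it a primary line sample.
learned_as_row = ["primary", "secondary", "primary", "primary"]

print(group_lines_into_rows(learned_as_secondary))
print(group_lines_into_rows(learned_as_row))
```

In this model, treating the continuation line as a primary line produces an extra row, which matches the row splitting described above.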
The Unmap Column method can undo mapping for the specified cell item.
This will undo mapping for all cell items that were extracted from the lines that belong to the same line type as the cell item used to invoke the Unmap Column method.
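A minimal sketch of why unmapping one cell item affects every line of the same type follows. It assumes, purely for illustration, that mapping is stored per line type and item position. The data layout, function names, and column names are hypothetical and are not taken from the product.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (line type, item position in the line) -> column name.
Mapping = Dict[Tuple[str, int], str]


def unmap_column(mapping: Mapping, line_type: str, position: int) -> Mapping:
    """Return a copy of the mapping without the entry for the given cell item."""
    return {key: column for key, column in mapping.items()
            if key != (line_type, position)}


def extract(lines: List[Tuple[str, List[str]]],
            mapping: Mapping) -> List[Dict[str, str]]:
    """Extract only the cell items that still have a column mapping."""
    table = []
    for line_type, items in lines:
        row = {mapping[(line_type, pos)]: text
               for pos, text in enumerate(items)
               if (line_type, pos) in mapping}
        table.append(row)
    return table


mapping = {("primary", 0): "Quantity", ("primary", 1): "Description"}
lines = [("primary", ["2", "Widget"]), ("primary", ["5", "Gadget"])]

# Unmapping one cell item removes that column from all lines of its type.
print(extract(lines, unmap_column(mapping, "primary", 1)))
```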
To undo incorrect column mapping, complete the following steps:
1. Right-click on any unassigned column (highlighted in blue) or draw a rectangular selection over the cell items to be mapped to a table column.
2. On the shortcut menu, select Undo Mapping.
The previously assigned column (highlighted in red) is now unassigned. The values are no longer extracted or in the table grid.
To learn a block of secondary lines, complete the following steps:
1. In the Document Viewer, use the mouse to draw a rectangular selection over the required secondary lines of a desired multi-line row.
2. Right-click on the selection.
3. On the popup menu, select Learn as Secondary Line(s).
All correctly selected primary lines will be learned and highlighted in green (or blue if the line is selected with low confidence) and all other lines are similarly extracted and displayed. If some lines were not extracted (these lines will not be color-coded), repeat the procedure described immediately above.
The Unlearn Line function can be used to discard previously applied learning for a particular line. To do this, Brainware Table Extraction uses a line sample, searches for the line type, and removes the line type from the learnset.
To unlearn a line, complete the following steps:
1. Switch to Lines Learning mode and right-click on the line you want to unlearn.
2. On the shortcut popup, click Unlearn Line. Unlearned lines change from green to gray.
Learning a wrong line means training the table so that a particular line is not extracted. This also applies to other lines of the same type in the table. To learn a line as a wrong line, complete the following steps:
1. Right-click on any learned line or draw a rectangular selection over the required lines.
2. On the popup menu, select Learn as Wrong Line. The selected lines and lines similar to them are now highlighted in gray. Information from these lines will not be extracted.
The basic purpose of Learn Set Manager is to use Supervised Learning to improve the quality and usefulness of your enterprise's Learnsets.
With Supervised Learning, Supervised Learning Verifiers and Learn Set Managers can customize your project's learnsets by adding or removing documents, reclassifying them, or creating altogether new classes or learnsets and migrating documents there. They can also promote local learnsets to a global learnset so that they can be shared across the enterprise.
In general, working with Learn Set Manager consists of:
• Creating new classes based upon the documents themselves and supplier information.
• Learning documents and adding them to the local learnset.
• Using the local learnset to improve the extraction of low-quality documents.
• Maintaining local learnsets.
• Updating and enhancing the global learnset with information from the local learnsets.
Although Supervised Learning was created for use with vendors' invoices, it can also be used with other types of knowledge. For example, a library might create classes based on type of material, subject matter, or author. Most of the illustrations and examples in this chapter use invoices.
Learn Set Manager can only be launched from Verifier.
You have access to Learn Set Manager mode only if you
have been assigned to a group that has permission to work with the mode. There
is no limit on the number of users who can simultaneously access Learn
Set Manager.
To start Learn Set Manager, complete the following step:
• Click the Start Learn Set Manager button in the Verifier toolbar.
Learn Set Manager has two basic modes, or views. These are the
Accumulated Documents View, where you
work with local learnsets, and the Global
Learnset View, where you work with common learnsets and global learnsets.
When you are working
with local learnsets, you can further refine the appearance of the Accumulated Documents browser when you
verify documents or manually reclassify them.
The Accumulated Documents Browsing view has the following sections:
Batch Viewer | Enables you to see each class in the batch you are working on. The Batch Viewer shows each class as part of the batch, the user who created the batch, the date it was created, the number of documents in the batch and in each class, the number of documents successfully classified, and the number of documents successfully extracted. You will need to enlarge the window to see all of these categories. |
Document Viewer | As with the Document Viewer in Verifier, this window enables you to see (and therefore verify) each document in the batch you are working on. You can verify documents in Document Viewer. |
Learning Statistic Window | Shows the documents that have been processed by WebCenter Forms Recognition. Documents that are awaiting processing have a question mark beside them. Successfully processed documents have a check mark, while documents that failed processing have an X. |
The Global Learnset Browsing view has the following sections:
Batch Viewer | Enables you to see each class in the batch you are working on. Shows the classes in the global learnset, and the number of documents classified or extracted in each. |
Document Viewer | As with the Document Viewer in Verifier, this window enables you to see (and therefore verify) each document in the batch you are working on. You can verify documents in Document Viewer. |
Learning Statistic Window | Shows the documents that have been processed by WebCenter Forms Recognition. Documents that are awaiting processing have a question mark beside them. Successfully processed documents have a check mark, while documents that failed processing have an X. |
The following menu
commands and keyboard shortcuts are available in Learn Set Manager:
Keyboard Shortcut | Command |
[Ctrl] + [E] | Closes Learn Set Manager. |
[F5] | Verifies documents in Supervised Learning. |
[F7] | Manually reclassifies document. |
[F1] | Opens Learn Set Manager help. |
The Learn Set Manager toolbar provides quick access to some frequently used commands:
Button | Description |
(icon) | Show settings. |
(icon) | Switch to Accumulated Documents Browsing (the local learnset). |
(icon) | Switch to global learnset processing. |
(icon) | Verify documents. |
(icon) | Correct tables. Allows you to correct data in the tables. You have to click a table field for this to be active. |
(icon) | Accept documents. |
(icon) | Reject documents. |
(icon) | Learn documents (add them to the global learnset). |
On the Viewer toolbar, you can use the following commands to adjust the size of a document relative to the width of the Document Viewer window:
Button | Description |
(icon) | Fits the document to window height. |
(icon) | Fits the document to window width. |
(icon) | Best fit. |
(icon) | Zooms in. |
(icon) | Zooms out. |
Use Learn
Set Manager to work with local learnsets. First, verify the
documents, decide whether they belong in the learnset, add them to the common learnset,
and train the learnset.
In the common learnset,
you examine the documents for inclusion in the global learnset, accept or
reject them, and add them to the global learnset. Finally, you train the global
learnset.
To configure Learn Set Manager, complete the following steps:
1. In Verifier, examine the properties for Learn Set Manager. Most were established in Designer. However, you need to ensure that Learn Set Manager is enabled and that the Activate Supervised Learning Workflow checkbox is checked. Also, ensure that the paths for Local Project Name, Local Learnset Directory, and Knowledge Base Directory are correct. If you are using the Forms Recognition database, you select a database job instead of specifying a Knowledge Base Directory path.
2. Launch the Learn Set Manager module from Verifier by clicking the Learn Set Manager button on the toolbar.
3. On the Learn Set Manager toolbar, click the Settings button.
4. Review the settings as follows:
Show Learned State Using Engine-Level Information | Make sure this option is checked. This setting indicates whether the particular field or document was used by the system for learning. If required, a user can also disable learning for the desired field or document. |
Automatically Reject Document if Number of Pages Exceeds | Documents with more pages than specified in the appropriate field will be prevented from being added to the learnset. |
Inherit Verifier Settings | If this option is selected, the settings made on the Verifier settings tab will be applied, and the options below will be grayed out. If you want Learn Set Manager to use a different global project, or a different job containing data, then clear this option and populate the options below according to your needs. |
Use Database as Document and Statistics Source | WebCenter Forms Recognition core information can be stored in the Forms Recognition database. |
Select Job | You can select the desired job from the Select Job dropdown if you have selected the Forms Recognition database as your source. |
Accumulative (common) batch root path | This option to set the path to the batch root is not available when using the database. |
Automatic Backup | Select the files you want to be backed up automatically: project file, project learnset, and LSM train set. |
To work with common learnsets, complete the following steps:
1. When you launch the Learn Set Manager module from Verifier, the Accumulated Documents Browsing Mode is shown by default. This is the mode you use to work with common learnsets, and it is activated when you click the Switch to Accumulated Documents Browsing toolbar button or when you select Accumulated Documents Browsing from the View menu.
2. In the Accumulated Documents Browsing Mode, select a batch to work on.
3. Double-click a class to select it.
4. Select a document to work on and verify the document just as you would in the traditional Verifier. Click the Advanced Verifier Mode toolbar button to correct or verify the contents of each field and table.
5. After you have verified the document, click the Accept button. This marks the document for learning as the first step for promotion into the global learnset. You can also click the Reject button to exclude the document from consideration for the global learnset.
6. Select another document from the batch by using the Learn Statistics Panel at the bottom of the screen, where you double-click a document to open it in the Document Viewer and work on it.
7. When you have verified all the documents you need to verify, click the Learn Documents button. This promotes all the accepted documents to the global learnset. Notice that the Learn Statistics window for the local learnset is now empty.
To correct tables, complete the following step:
• Select a table in a document and then click the Correct Tables button.
This enables Supervised Learning Managers and Verifiers to interactively train all the tables on a document form, not just the table you selected so you could activate the Correct Tables button. From there, table correction in Learn Set Manager proceeds just as it does in Verifier.
To reclassify a document, complete the following steps:
1. To assign a document to a different class, select the Document menu from the main menu and select Reclassify. Alternatively, press the [F7] key.
2. This opens a dialog box in the Verifier Document Viewer where you can assign the document to a new class.
3. Select the new class from the list and press the [Enter] key.
Note: Manual reclassification will only succeed with the classes that Verifier is currently using, not the classes that have already been learned.
There are two ways
to accept or reject a document or batch from a learnset. The first is the
traditional way, by using Verifier to manually screen and
verify the document or batch. The other method is by comparing documents in the
common learnset to the corresponding batch in the global learnset.
Note: If you enable the Automatically Reject Document if Number of Pages Exceeds option in the Learn Set Manager settings, documents with more pages than specified in the appropriate field will be prevented from being added to the learnset. The user is notified when a document exceeds the allowed number of pages, and can continue the process by choosing Yes in the pop-up window.
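The page-limit rule in the note above can be sketched as a simple filter. This is an illustration only. The function name and the dictionary of page counts are hypothetical, and the sketch assumes a document is rejected only when its page count is strictly greater than the configured limit.

```python
from typing import Dict, List, Tuple


def split_by_page_limit(page_counts: Dict[str, int],
                        max_pages: int) -> Tuple[List[str], List[str]]:
    """Split documents into those allowed into the learnset and those rejected.

    A document is rejected when it has more pages than max_pages.
    """
    allowed = [name for name, pages in page_counts.items() if pages <= max_pages]
    rejected = [name for name, pages in page_counts.items() if pages > max_pages]
    return allowed, rejected


# Hypothetical documents and page counts.
documents = {"invoice_a": 2, "invoice_b": 14, "invoice_c": 5}
allowed, rejected = split_by_page_limit(documents, max_pages=5)
print("allowed:", allowed)
print("rejected:", rejected)
```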
Learn Set Manager can sort by vendor name across multiple
batches produced by different local supervised learning Verifiers. This
simplifies Supervised Learning Workflow decisions as to which documents to
train for a specific vendor class. With the help of this feature, the user can
now review all documents (for the same vendor class) created by different Verifier
workstations at once.
From the View menu, select Sort Batches by Vendor. The system rebuilds the batches of documents created through multiple sessions by multiple Verifier workstations and allows the Learn Set Manager user to sort by vendor name. In this case, each vendor folder accumulates all available documents for this vendor, so that the user can select the best documents to train the global project with.
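Conceptually, sorting batches by vendor regroups documents from many batches into one folder per vendor. The following sketch shows that regrouping under stated assumptions: the batch names, document names, and the data layout are hypothetical, and the product's internal storage is not described in this guide.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_vendor(
        batches: Dict[str, List[Tuple[str, str]]]) -> Dict[str, List[str]]:
    """Regroup (document, vendor) pairs from all batches into vendor folders.

    The result is ordered alphabetically by vendor name.
    """
    folders: Dict[str, List[str]] = defaultdict(list)
    for documents in batches.values():
        for document, vendor in documents:
            folders[vendor].append(document)
    return {vendor: folders[vendor] for vendor in sorted(folders)}


# Hypothetical batches created on two different Verifier workstations.
batches = {
    "workstation1_batch": [("doc1", "Northwind"), ("doc2", "Acme")],
    "workstation2_batch": [("doc3", "Acme")],
}
print(group_by_vendor(batches))
```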
The Created On and Created By data fields are displayed separately for each particular
document in the Document List at the
bottom of the Learn Set Manager window.
Note that in the Global Learnset browsing mode, the options to sort batches by vendor and sort batches by date are disabled. Both of these sorting options are enabled only in Accumulated Document browsing mode. However, in Global Learnset browsing mode, the vendor classes are sorted in alphabetical order. Furthermore, under the Display Class Name column, the class name is shown only for the document classes that already exist in the global project.
The global learnset is where you further refine the quality of your data before migrating it into an effective and useful global knowledgebase. To work with global learnsets, complete the following steps:
1. Click the Global Learnset Viewing button.
2. Begin your work at the document level by examining the quality of the data in the document. As before, you select the document from the Learning Statistics window at the bottom of the screen.
3. Right-click the document if you are satisfied with it, and then select Enable.
4. Select Use to Train Base Classes.
5. Click the Learn button.
6. Confirm that you do want to learn the document by clicking Yes.
7. After you have verified each field in the document, click the Accept button on the toolbar to accept the document for processing, or click the Reject button to reject it.
8. Now retrain the learnset by clicking the Learn Documents button on the toolbar.
The final milestone in creating or enhancing your global learnset is to train base classes. To train a base class, complete the following steps:
1. On the Options menu, select Train Base Classes.
2. Select the base class to train.
3. Under Train Selected Base Classes, select a value. To avoid errors while maintaining the quality of your sample, select the lowest value possible.
4. Click OK.
The ability to update local projects is important for keeping your learnsets synchronized. While you work with Verifier, the global project's learnsets are constantly updated. An administrator may then wish to update the out-of-date local projects with a new global project. The administrator adds all the local projects to the main list, points to the global project template for overwriting, and presses Update to update them.
To update local projects, complete the following steps:
1. On the Options menu, click Update Local Projects.
2. Configure the selections described below and click Update. This procedure can take a while, especially if you are updating projects on a network. Note that locked projects will not be updated. The Update Local Projects dialog box specifies a list of local or network project paths to be managed. Here, you can:
• Add project paths to the Update list by clicking Add and browsing to the project.
• Remove paths from the Update list by clicking Remove and browsing to the project.
• Change existing project paths by clicking Change and browsing to the project.
• View a history of all Verifier workstations that have connected to the common learnset. The History shows the workstation name, the time and date of its last connection, and the local project path. This list is updated every time a Verifier station creates a new batch of locally learned documents in the common learnset.
3. Refresh the list of projects to see the most recent update information about them.
4. Empty the template cache. The Empty Template Cache Project field is used to update the local project.
5. Update the list of projects or save the new criteria without actually updating the projects.
For each configured
path, the dialog box shows whether a project is up to date, whether it is
locked, and whether the project is available. An up-to-date project has a green
check mark beside it; a project that has not been updated has a red X. Path names of unavailable projects
are dimmed. Note that the settings you establish above will be available for
any workstation on which Learn Set Manager is opened.
Learn Set Manager can be used simultaneously on more than one workstation, thanks to the application's ability to lock projects and files. Learn Set Manager supports three levels of protection that facilitate this ability:
• Batch-level locking.
• Allowing all Learn Set Manager workstations to view changes made by all Supervised Learning Managers.
• Locking project files and learnsets while they are being trained.
Batches are locked when they are in process. This prevents several users from updating the same batch at the same time. No one else can access the batch until processing is completed and the batch is closed.
The changes applied by one Learn Set Manager user should be visible to all other Learn Set Manager users. To accomplish this, Learn Set Managers must use two predefined batch document states:
• 981: Accepted.
• 982: Rejected.
When learning is
executed, the Learn Set Manager application checks
to see if documents with either of these assigned states have been added to the
local learnset. Documents with a state of 981
are added to the global learnset. Documents with a state of 982 are not added to the Global Learnset.
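The effect of the two states at learning time can be sketched as a filter over the batch documents. The state values 981 and 982 come from the text above. The class and function names are hypothetical, and the sketch does not reflect how the application stores its data.

```python
from dataclasses import dataclass
from typing import List

STATE_ACCEPTED = 981  # from the guide: accepted for learning
STATE_REJECTED = 982  # from the guide: rejected


@dataclass
class BatchDocument:
    name: str
    state: int


def documents_for_global_learnset(
        documents: List[BatchDocument]) -> List[BatchDocument]:
    """Keep only accepted documents; rejected or other states are skipped."""
    return [doc for doc in documents if doc.state == STATE_ACCEPTED]


# Hypothetical batch content.
batch = [
    BatchDocument("invoice_1", STATE_ACCEPTED),
    BatchDocument("invoice_2", STATE_REJECTED),
    BatchDocument("invoice_3", STATE_ACCEPTED),
]
print([doc.name for doc in documents_for_global_learnset(batch)])
```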
Only one workstation can do learning at one time. This means that the learning process is locked, and therefore not available for other users, if one user has initiated learning.
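The one-learner-at-a-time rule behaves like a mutual-exclusion lock that is tried without waiting. The sketch below is an in-process analogy only. Learn Set Manager locks learning across workstations by a mechanism this guide does not describe, and the class and method names here are hypothetical.

```python
import threading
from typing import Optional


class LearningLock:
    """Allows a single user at a time to run the learning process."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.owner: Optional[str] = None

    def try_start(self, user: str) -> bool:
        """Start learning if nobody else is learning; never blocks."""
        if self._lock.acquire(blocking=False):
            self.owner = user
            return True
        return False

    def finish(self) -> None:
        """Release the lock so that another user can start learning."""
        self.owner = None
        self._lock.release()


learning = LearningLock()
print(learning.try_start("manager_a"))  # first user obtains the lock
print(learning.try_start("manager_b"))  # second user is refused meanwhile
learning.finish()
print(learning.try_start("manager_b"))  # available again after finishing
```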
The frequently asked questions and their answers below may help to resolve some situations that you might experience during the extraction and validation process.
Q. In one of my batches, there is a document
that must be classified manually, but it does not belong to one of the
available classes. I cannot release the batch as it is. What can I do to finish
my job?
A. Normally, your organization will have specialized workstations where people are in charge of handling special cases that only occur as exceptions.
Q. In one of my batches, there is a document I have already validated. However, I've overlooked a mistake in this document. I don't want to release the batch without correcting it.
A. You can use the Document Mode to get to the document. Select the document and switch to Verify Mode. Make corrections and press the [Enter] key.
Q. Sometimes the indexing window looks unusual.
It has no field area, only the current input area. How do I get to the next
field?
A. This is not a problem. You can use all keyboard shortcuts for field navigation from within the current input area.
Q. When I switch from one field to the next, the document moves as well. I find this annoying. Is there a way to stop that?
A. The application always searches for the document area associated with the current field's content. This area is then displayed. To turn this off, click the Keep Focus button in the toolbar. Alternatively, you could just use a different magnification ratio.
Q. I tried to start Learn Set Manager, but I do not want to have to go through Verifier first. Can I do it?
A. No. Learn Set Manager is an add-in that can only be started in Verifier.
Q. I tried to start Learn Set Manager in Verifier, and I still can't do it. Why?
A. There are three possible reasons for this:
• You may not have permission to use this add-in. Check with your project administrator to see if you are assigned to a group that can work with Learn Set Manager mode.
• Learn Set Manager might not be enabled for the project. Again, contact your project administrator.
• Learn Set Manager mode might not be properly licensed. Learn Set Manager gets its license through a Runtime Server process. If you've been able to get into Learn Set Manager mode before, but you cannot now, it may be that Runtime Server has stopped.