Oracle Fusion Middleware

Oracle WebCenter Forms Recognition Verifier User's Guide

12c Release 1 (12.2.1.4.0)

E93589-02

September 2019

This guide describes how to configure and use WebCenter Forms Recognition Verifier to verify documents and to perform the Supervised Learning workflow.


Copyright © 2009, 2019, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.

Contents

1  What Is Verifier?

1.1  What Happens When Verifier Processes Documents?

1.1.1  Verifier Key Features:

1.1.2  Verifier Functionality Highlights

1.2  Some Helpful Terms

2  About WebCenter Forms Recognition Workflow

2.1  Start and Exit Verifier

2.2  About Logging into Verifier

2.3  About Specifying Login Information with Command Line Arguments

3  Roles

3.1  Change Your Password

4  About Configuring Verifier

4.1  Configure Verifier

4.2  About the General Settings

4.2.1  About Specifying the Project File

4.2.2  About Specifying the Directories

4.2.3  About Specifying Client Settings

4.2.4  About Specifying the Batch Options

4.2.5  About Fields Edit Mode

4.2.6  About Specifying Tabbing Behavior

4.2.7  Enable 508 Compliance

4.3  About the workflow settings

4.4  About Exception Handling Settings

4.4.1  Select States

4.4.2  Configuring Exception Handling

4.5  Quality Assurance with WebCenter Forms Recognition

4.6  Configure Supervised Learning

4.6.1  Supervised Learning Settings

4.7  Advanced Settings

4.8  About Batch Filtering

4.8.1  Configure Batch Filter Conditions

5  Configure Tasks to Perform at the Workstation

5.1.1  Configure the input and output states

5.1.1.1  About Verification Rules

6  Getting Familiar with the User Interface

6.1  Show Batch List

6.1.1  Batch List Keyboard Shortcuts

6.1.2  Batch List Toolbar Buttons

6.1.3  Navigation Toolbar Buttons

6.1.4  Batch List Icons

6.1.4.1  Batch List Columns

6.1.4.2  Sort in Batch List

6.1.4.3  Select a Batch in Batch List

6.1.4.4  Leave the Batch List

6.2  About the Document List

6.2.1  Open Document List

6.2.2  Document List Keyboard Shortcuts

6.2.3  Main Toolbar Buttons

6.2.4  About the Batch Structure Area

6.2.5  Navigate in the Document List

6.2.5.1  About Splitting and Appending Documents

6.2.5.2  Split a Multipage Document

6.2.5.3  Append Two Documents

6.2.6  Viewer Toolbar Buttons

6.2.7  About the Document Area

6.2.8  Print a Document

6.2.8.1  Printing Verified Data Content

6.2.9  About the Verification View Classification Window

6.2.9.1  Open the Verification View Classification Window

6.2.9.2  Classification Window Keyboard Shortcuts

6.2.9.3  Verification View Classification Window Toolbar Buttons

6.2.9.4  About the Class Selection List

6.2.9.5  Set or Change a Classification Result

6.2.9.6  Select a Class Result Using Advanced Classification

6.2.10  About the Verification View Indexing Window

6.2.10.1  Open the Verification View Indexing Window

6.2.10.2  Increase or Decrease the Image Area

6.2.10.3  Indexing Window Keyboard Shortcuts

6.2.10.4  Verification View Indexing Window Toolbar Buttons

6.2.10.5  Support for the Mouse Wheel

6.2.10.6  Form Area

6.2.10.7  Field Area

6.2.10.8  Navigate the Field Area

6.2.10.9  About the Verification View Document Area

6.2.10.10  Navigate the Document Area

6.2.10.11  About Tables in the Document Area

6.2.10.12  About the Current Input Area

6.2.10.13  About the User Information Area

7  About Working with Verifier

7.1  Manual Correction of Automatic Page Separation

7.2  About Manual Correction of Classification Results

7.2.1  Manually Correct Classification Results

7.3  About Processing Documents with an Obsolete Class

7.4  About Manual Correction of Extraction Results

7.4.1  Correct Invalid Results

7.4.1.1  Form Elements and Field Types

7.4.1.2  About Editing Text Fields

7.4.1.3  About Auto-Completion

7.4.1.4  About Inserting Words in Fields

7.4.1.5  Use a Word that is a Candidate for a Field

7.4.1.6  Use a Word That Is Not a Candidate for a Field

7.4.1.7  Insert Blocks of Text

7.4.2  Finish the Validation

7.5  About Manual Correction of Classification and Extraction Results

7.6  About Smart Indexing

7.6.1  Use Smart Indexing

7.7  Check Entire Batches

8  About Working with Tables

8.1  About Manual Training and Correction Methods

8.1.1  About Auto-Completion with Tables

8.1.2  Insert Candidate Words in Table Cells

8.1.3  Insert Non-Candidate Words in Table Cells

8.1.4  Correcting Table Structure

8.1.5  About the Rubber-Banding Feature

8.1.5.1  Auto-Scroll with the Rubber-Banding Feature

8.1.5.2  Add Column Data to a Table

8.1.5.3  Insert Column Data

8.1.5.4  Replace Column Data

8.1.5.5  Use the Rubber-Banding Feature

8.1.5.6  Rubber-Banding Use Case One

8.1.5.7  Rubber-Banding Use Case Two

8.1.5.8  Rubber-Banding Use Case Three

8.1.5.9  Rubber-Banding Use Case Four

8.1.6  About Correcting Single Cells

8.1.6.1  Delete an Unnecessary Cell

8.1.6.2  Insert a Cell

8.2  About Table Extraction and Correction

8.2.1  About Learning Lines

8.2.2  About Learning Column Mappings

8.2.3  About Correcting Fields in Tables Created with Brainware Table Extraction

8.2.4  Use the Standard Method for Table Extraction

8.2.4.1  Step 1: Show the First Row Sample

8.2.4.2  Step 2: Learn Mapping in the Row You Learned

8.2.4.3  Step 3: Learn Missing Lines

8.2.4.4  Step 4: Learn and Adjust the Mapping of Missing or Wrong Columns

8.2.4.5  Step 5: Manually Correct the Table Data and Validate the Table

8.2.4.6  Step 6: Learn the Document

8.2.5  Advanced Learning with Brainware Table Extraction

8.2.6  Advanced Learning: Additional Functions

8.2.6.1  About the Unmap Column Method

8.2.6.2  Undo Column Mapping

8.2.6.3  Learn a Block of Secondary Lines

8.2.6.4  Unlearn a Line

8.2.6.5  Learn a Line as a Wrong Line

9  Working with Learn Set Manager

9.1  About Supervised Learning

9.2  Start Learn Set Manager

9.3  Getting Familiar with the Learn Set Manager User Interface

9.3.1  Accumulated Documents Browsing

9.3.2  Global Learnset Browsing

9.3.3  Learn Set Manager Keyboard Shortcuts

9.3.4  Learn Set Manager Toolbar Buttons

9.3.5  Learn Set Manager Viewer Toolbar Buttons

9.4  Using Learn Set Manager

9.4.1  About the Learn Set Manager Process

9.4.2  Configure LearnSet Manager

9.4.3  Work with Common Learnsets

9.4.3.1  Correct Tables

9.4.3.2  Reclassify a Document

9.4.3.3  Accept and Reject Documents

9.4.4  About Sorting by Vendor and Other Sorting Extensions in Learn Set Manager

9.4.5  Work with Global Learnsets

9.4.6  Train Base Classes

9.4.7  Update Local Projects

9.5  About Using Learn Set Manager on Several Workstations

9.5.1  About Batch-Level Locking

9.5.2  About Tracking Changes Made by Supervised Learning Managers

9.5.3  About Project File and Learnset Locking During Training

10  Frequently Asked Questions

 

1      What Is Verifier?

WebCenter Forms Recognition is a product suite designed for automatically processing incoming documents. It can process documents from arbitrary sources, including scanned paper-based documents and electronic files.

1.1     What Happens When Verifier Processes Documents?

Structured or unstructured document input is obtained either by scanning paper-based documents or directly as electronic files. All documents are stored on a computer's hard drive. WebCenter Forms Recognition monitors specified directories on this hard drive for new documents. If new documents are detected, WebCenter Forms Recognition imports them.
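The directory monitoring described above can be pictured with a small polling sketch. This is only an illustration of the general idea, not how WebCenter Forms Recognition is implemented; the function name find_new_documents and the in-memory set of seen file names are assumptions made for this example.

```python
from pathlib import Path
from typing import List, Set


def find_new_documents(watch_dir: Path, seen: Set[str]) -> List[Path]:
    """Return files in watch_dir that were not seen on earlier calls.

    Hypothetical helper: 'seen' holds the names of files that have
    already been picked up, so each file is reported only once.
    """
    new_files = []
    for path in sorted(watch_dir.iterdir()):
        if path.is_file() and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling the function repeatedly with the same set returns only the files that arrived since the previous call, which is the behavior an import step would need.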

Imported documents are first analyzed to determine the document layout and to recognize structures such as words, lines, logos, or tables.

The documents are then classified according to predefined categories. Examples of typical categories used in classification are invoices, orders, offers, or resumes. Categories can be defined individually, depending on the needs of your organization. Using a set of sample documents, WebCenter Forms Recognition learns to distinguish which category a previously unknown document belongs to.

For each category, the data relevant for further processing is different. For example, if you are processing invoices, you probably want to know the total sum to be paid. This information is irrelevant if you are processing resumes, where the applicant's name, the desired position, and the contact options are more important. WebCenter Forms Recognition identifies and extracts data that is relevant for the respective document category. The data that is to be extracted can be defined individually to suit the needs of your organization.

Finally, the documents, their category assignments, and the extracted information are released from WebCenter Forms Recognition and written to designated export directories. The documents are then forwarded to connected systems. For example, invoices can automatically be forwarded to the software system used in your company's accounting department, while resumes are sent to Human Resources.

All this is done without human intervention once the WebCenter Forms Recognition application has been set up. But what happens if WebCenter Forms Recognition cannot properly process a document? There are several reasons this could happen, for example:

  Paper-based documents might be unclear, so that WebCenter Forms Recognition is not able to read them.

  There might be stamps or notes on the documents that make important sections illegible for WebCenter Forms Recognition.

  WebCenter Forms Recognition may encounter a document from an unknown category. Since the software was not previously trained to recognize documents from this category, it will not be able to process the document.

  WebCenter Forms Recognition may have been configured to extract information that is missing from the document, for example because a form was not filled in completely.

That is where Verifier comes in. Verifier is the quality assurance utility of the WebCenter Forms Recognition suite. The application detects all documents with processing problems and presents them to the operator for verification.

Since the verification step is done before the export step, only qualified output will leave the WebCenter Forms Recognition process. Therefore, subsequent systems will only receive appropriate input.
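The idea that only documents with processing problems are shown to an operator, while all others go straight to export, can be sketched as a simple routing step. The class ProcessedDocument, the functions needs_verification and route, and the use of None to mark a failed classification or a missing field value are assumptions made for this illustration; they are not part of the product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ProcessedDocument:
    """Hypothetical result of automatic classification and extraction."""
    name: str
    category: Optional[str]  # None means the document could not be classified
    fields: Dict[str, Optional[str]] = field(default_factory=dict)  # None marks a missing value


def needs_verification(doc: ProcessedDocument) -> bool:
    """A document needs an operator if its class or any field value is missing."""
    if doc.category is None:
        return True
    return any(value is None for value in doc.fields.values())


def route(docs: List[ProcessedDocument]) -> Tuple[List[ProcessedDocument], List[ProcessedDocument]]:
    """Split documents into (to_verify, to_export), keeping their order."""
    to_verify, to_export = [], []
    for doc in docs:
        (to_verify if needs_verification(doc) else to_export).append(doc)
    return to_verify, to_export
```

In this sketch a document reaches the export list only when it has a category and every expected field has a value, which mirrors the statement that only qualified output leaves the process.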

The WebCenter Forms Recognition database platform enables you to keep a central store of your project and authentication information. This solution also allows for central management of storage and backup and thus provides for easier security, better connectivity of your applications, and higher flexibility for your personnel.

1.1.1     Verifier Key Features

  Allows central WebCenter Forms Recognition database storage of projects and of user authentication information for more flexible access.

  Allows convenient correction of automatic classification results.

  Allows convenient correction of automatic extraction results.

  Allows manual indexing of documents.

  Allows semi-automatic indexing of documents by means of database lookups.

  Allows a final check of corrected documents before release.

1.1.2     Verifier Functionality Highlights

  Sophisticated status management and filter techniques show you only the documents you have to check and nothing else.

  During the application design, the user interface can be configured, providing optimum display options for each document category.

  Keyboard shortcuts are available for most operations.

  Through automatic locking, document batches can be processed by teams of operators.

1.2     Some Helpful Terms

This section defines terms that are helpful when you work with Verifier.

Batch

A batch is just a stack of documents. Usually, this stack is not sorted. In the context of WebCenter Forms Recognition, batches consist of electronic documents. The documents inside such a batch may be paper-based documents that have been scanned to transform them into a digital format, or files created using applications such as a word processor. Various documents are normally assigned to the same batch only because they have been received within the same time period. For example, all letters received in the morning may be scanned until noon and therefore end up in the same batch.

Folder

In a business environment, folders are normally used to keep several documents together. WebCenter Forms Recognition does the same thing with folders. However, in the context of WebCenter Forms Recognition, a folder is always a structure inside a batch. This means that batches can either consist of document stacks, or they can consist of stacks of folders.

Document

A document is a piece of information that can serve as evidence of an event, situation, or business transaction. For example, a packing slip may provide evidence that an order has actually been shipped. Since people are used to working with paper, electronic documents strongly resemble paper-based documents. You will notice that WebCenter Forms Recognition documents consist of one or several pages, though the concept of a page is not really required for digital documents.

Classification

Classification means taking an unsorted stack of documents and organizing them into smaller stacks so that each stack contains only documents belonging to the same category. In other words, you start with a mess and end up with an organized stack of invoices, a second stack of resumes, a third stack of orders, and so on. Class and category are the same thing.

Indexing

Imagine you have a homogeneous stack of invoices, and you start to write out the information contained in the documents. For each invoice in the stack, you note the name of the supplier, the total sum to be paid, and the invoice number. This procedure is called indexing, and the information that was noted is the indexing information. Once you have finished, you file the invoices and use the indexing information to build your filing structure. Later, you will be able to search and identify the document with the help of the indexing information. In the context of WebCenter Forms Recognition, indexing information is applied to a set of fields associated with the document. For each document category, a different set of fields can be used.

Extraction

If you take the stack of invoices and again write out the name of the supplier, the total sum to be paid, and the invoice number, but this time it is done automatically, the procedure is called extraction. Extraction is a means for automatic document indexing. Extraction is context-sensitive; that is, the extracted information depends on the document category.

State

A state is a number that tells you how far the processing of a document has progressed. If the entire procedure of document processing consists of single steps, then the state increases with each step that has been completed. The state also indicates whether a step has been completed successfully, or whether there have been problems. In WebCenter Forms Recognition, states are determined hierarchically from the bottom up. If anything is wrong with a document, then there is also something wrong with the batch it belongs to.

Verification

Verification is a task related to quality assurance. It involves taking a document that has been processed or partially processed, checking the processing results, and correcting any errors.

Validation

Validation is another task related to quality assurance. Validation means confirming that a processing result is correct. This can be done at several levels:

  For the class or a field associated to a document.

  For the document as a whole.

  For an entire batch.

Learnset

In classification, a learnset is a group of documents whose classification is specified by a user. For each view and each class, the user must provide a sufficient number of representative documents. Similarly, in extraction, a learnset is a set of documents whose field contents are selected by the user from a set of candidates.

2      About WebCenter Forms Recognition Workflow

In WebCenter Forms Recognition, the flow of incoming documents follows a sequence of standard processing steps. One of the objectives of WebCenter Forms Recognition is to get documents to their recipients as quickly as possible.

Automatic steps are executed by the Runtime Server and include document import with batch creation, OCR and layout analysis, classification, extraction, export, and clean-up. These automatic steps are complemented by two manual verification steps that ensure only high-quality output is produced:

  Verification of the classification step.

  Verification of the extraction step.

If the Runtime Server has completed an automatic step and the batch contains only valid results, the next automatic step can be accomplished without human intervention.

However, if the Runtime Server detects that the batch contains invalid results, you can analyze and resolve the problem manually. Invalid batches are presented to you in a task list, called the Batch List. Finally, when WebCenter Forms Recognition has finished processing a batch, the documents are sent to their recipients.

2.1     Start and Exit Verifier

If Verifier was installed as recommended, you can launch it from the Windows Start menu as follows:

Start > All Programs > Oracle WebCenter Forms Recognition > WebCenter Forms Recognition Verifier

After startup and login, the application displays the Batch List.

To quit Verifier, select Exit from the File menu.

2.2     About Logging into Verifier

When you log in to an existing project in Verifier, you must supply your user name and password. This password is not necessarily the same as the one you use to log in to your workstation. Instead, it is specific to Verifier, and possibly to the project. However, you probably have the same user name and password for all Verifier projects you work on. Your user name and password were assigned to you in Designer when your project administrator configured the project.

Your project administrator can give you the option to remember your user name and password between logons. This has been enabled if the Remember Password checkbox appears on the logon form. To remember your user name and password between logons, fill in your user name and password and check Remember Password before clicking OK. The next time you log on from the same computer, the system fills in the user name and password automatically, so that simply clicking OK logs you in.

When you launch Verifier for the first time, the application is not yet configured. The Batch List is empty and an error message is displayed. Verifier must be configured before it can be used, and this should be done by an experienced user.

2.3     About Specifying Login Information with Command Line Arguments

To suppress project authentication when starting Verifier, you can specify logon information as command line arguments. The command line argument for user name is /USR, and for password it is /PWD.

For example, the following line in a Windows batch file placed in the WebCenter Forms Recognition program folder launches Verifier under John Smith's account:

start /B DstVer.exe /USR "John Smith" /PWD john1234567

You can use the same mechanism from the Windows Run menu, for example:

"C:\Program Files (x86)\Oracle\WebCenter Forms Recognition\Bin\bin\DstVer.exe" /USR "John Smith" /PWD john1234567

If the password is empty, there is no need to specify the /PWD option. For example:

start /B DstVer.exe /USR "Guest User"
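The same launch can also be scripted. The following Python sketch only assembles the argument list described above. The function name build_verifier_command is an illustrative assumption and is not part of the product; only the /USR and /PWD arguments and the DstVer.exe file name come from this guide.

```python
from typing import List, Optional


def build_verifier_command(exe_path: str, user: str,
                           password: Optional[str] = None) -> List[str]:
    """Return the argument list for starting Verifier with logon arguments.

    Hypothetical helper: only /USR and /PWD are taken from the guide.
    """
    command = [exe_path, "/USR", user]
    # /PWD is omitted when the password is empty, as described above.
    if password:
        command += ["/PWD", password]
    return command


# Example: build the arguments for the guest account without a password.
guest_command = build_verifier_command("DstVer.exe", "Guest User")
print(guest_command)
```

The resulting list can be passed to subprocess.Popen, which on Windows quotes arguments that contain spaces, such as "John Smith". Keep in mind that a password supplied on the command line is visible to anyone who can read the script.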

The administrator can also use a script to review who has logged in to the application. Refer to the Oracle WebCenter Forms Recognition Scripting User's Guide for more information.

3      Roles

Depending on the assigned role, Verifier users are able to complete the following tasks:

  Define, modify, and maintain the learn set.

  Collect and manage local training data.

  Propose learn set candidates to improve the performance.

  Verify the documents that Runtime Server could not automatically process.

  Access and change the batch filtering properties.

  Access and change the settings.

Refer to the Users, groups, and roles section in the WebCenter Forms Recognition Designer User Guide for more information.

3.1     Change Your Password

To change your password, complete the following steps:

1.       Load the project in the Designer application.

2.       From the Options menu, select Change Password... The Properties dialog box is displayed.

3.       Enter your existing password in the Old Password field.

4.       Enter a new password in the New Password field.

5.       Enter the new password again in the Confirm New Password field.

6.       Optional. Select the Decode Password option to display the password in plain text as you type it.

If Decode Password is cleared, the password is masked with asterisks as you type it.

7.       Optional. Select Update Password in Database, which is only available if the project administrator has enabled database authentication for the project.

8.       Click OK.

 

4      About Configuring Verifier

You can only change the Verifier settings if you have been assigned the Verifier Settings role.

Configuring Verifier entails specifying which batches of documents are processed at a given workstation. This includes the following:

  Sourcing of the batches either from the file system or from the WebCenter Forms Recognition database.

  The location of the batches in the file system.

  The project file that contains the settings used to process the documents.

  The processing steps that you want to verify, i.e. classification, extraction, or both.

  The status of batches before and after processing.

It also entails configuring 508 Compliance, but this is done at the workstation level, not the project level.

After you configure the Verifier settings, you can load and save them using commands on the File menu. When loading or saving a project, you can load or save a file with or without network data. When loading, click on the file type drop-down box and select either <project name> (*.sdp), or <project name> skip learn data (*.sdp).

Note: You can only work with Verifier after these settings are established. Only experienced users should change the settings.

4.1     Configure Verifier

To configure Verifier, perform one of the following actions:

From the Options menu, select the Settings... option.

Click the Settings button on the toolbar.

4.2     About the General Settings

The General tab is the place for general settings. It allows you to configure your referenced directories and files. Also, you can choose the WebCenter Forms Recognition database as your document and statistics source here.

4.2.1     About Specifying the Project File

The Use Project File option is used to select the path and file name of the WebCenter Forms Recognition project that processes the documents, and which contains the design of the verification forms that you use to verify the extraction.

  When you select a new project and click OK, the project loads after returning to the Batch List.

  The title bar indicates the currently loaded project.

  When you log in to Verifier, the system prompts you to let you know which project is loaded.

If the Use Batch Specific Project File option is active, the project loads each time you open a batch. If you use this option, ensure that the Allow Database Authentication option is selected for all users in the Designer.

4.2.2     About Specifying the Directories

With the Use Database as Document and Statistics Source option, WebCenter Forms Recognition core information can be stored in the Forms Recognition database. Furthermore, you are able to select the job you want from the Select Job dropdown if you have selected the database as your source.

Note: The file system functionality is still supported, although use of the database is recommended.

The Display ... Batches per Page option sets the number of batches displayed per page. Valid values are from 1 to 200, and the default value is 50. This option is only available if the Use Database as Document and Statistics Source option is selected.
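The effect of the batches-per-page value can be sketched as follows. This is an illustration only: the function names are hypothetical, and rejecting out-of-range values with an error is an assumption, because this guide does not state how Verifier handles them. Only the range of 1 to 200 and the default of 50 come from the text above.

```python
DEFAULT_BATCHES_PER_PAGE = 50
MIN_BATCHES_PER_PAGE = 1
MAX_BATCHES_PER_PAGE = 200


def validate_batches_per_page(value: int) -> int:
    """Return the value unchanged if it is within the documented range."""
    if not MIN_BATCHES_PER_PAGE <= value <= MAX_BATCHES_PER_PAGE:
        # Assumption: the guide does not say how invalid values are handled.
        raise ValueError("batches per page must be between 1 and 200")
    return value


def page_count(total_batches: int,
               per_page: int = DEFAULT_BATCHES_PER_PAGE) -> int:
    """Return how many pages are needed to list all batches."""
    per_page = validate_batches_per_page(per_page)
    # Ceiling division without floating-point arithmetic.
    return -(-total_batches // per_page)
```

For example, with the default setting a job of 120 batches is spread over three pages.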

The Batch Root directory is where the batch control files are located. This option is not available when using the WebCenter Forms Recognition database.

The Image Root directory is where subdirectories with the scanned images can be found. As a rule, the batch root and image root should be the same. In special cases, for security reasons for example, the image root can be different from the batch root.

4.2.3     About Specifying Client Settings

The Client option refers to the intent to use client-specific variables. Currently, only the default setting is available. In Designer, project administrators can define global variables for different clients. With the default entry, global variables do not vary by client.

4.2.4     About Specifying the Batch Options

If the Automatic Batch Refresh option is checked, the Batch List automatically shows newly generated batches with matching states. If you do not want the automatic update you can clear the checkbox. This leaves you the option to refresh the Batch List manually when you need up-to-date information.

The Create New Image File When Cutting Document option enables Verifier users to create new TIFF image files when a workdoc is split into multiples. The TIFF files correspond to the new workdocs.

Note: This feature is disabled when the Use Database option is selected. Any project that needs to use this feature must use the file system instead of the Forms Recognition database.

The Enable Cut Keeping Cover Page option enables Verifier users to cut a long document, such as a multipage fax, into several shorter documents while still retaining the cover page of the original workdoc as the cover page for each of the newly created, shorter, documents. If this is checked, the shortcut menu in Document Browsing view has additional menu entries. The new documents must then be OCR'd again.

4.2.5     About Fields Edit Mode

When a document is opened that requires correction or confirmation of extraction results, the cursor is automatically placed in the first invalid field. If you select Insert mode, the cursor is inserted to the left of the field contents. If you select Overwrite mode, the entire field content is selected.

4.2.6     About Specifying Tabbing Behavior

If you select the Tab Through Invalid Fields Only option, when a user presses the [Tab] key, [Shift] + [Tab], [Ctrl] + [Tab], or [Ctrl] + [Shift] + [Tab] to tab through the fields in Document Verification mode, the system tabs through invalid fields only. When the user presses the [Tab] key inside a table control, the system tabs through invalid table cells only.
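The tabbing behavior can be pictured with a small sketch. It is not the Verifier implementation: the function name is hypothetical, and wrapping around from the last field to the first is an assumption that this guide does not confirm. Each field is represented only by a flag that states whether it is valid.

```python
from typing import List, Optional


def next_field_index(valid_flags: List[bool], current: int,
                     invalid_only: bool,
                     backwards: bool = False) -> Optional[int]:
    """Return the index of the next field to receive focus, or None.

    valid_flags[i] is True when field i is valid. With invalid_only set,
    valid fields are skipped, as with Tab Through Invalid Fields Only.
    """
    count = len(valid_flags)
    step = -1 if backwards else 1
    for offset in range(1, count + 1):
        # Assumption: focus wraps around at the ends of the form.
        candidate = (current + step * offset) % count
        if not invalid_only or not valid_flags[candidate]:
            return candidate
    return None
```

With the option cleared, the function simply returns the neighboring field. With the option selected, it returns None when no invalid field is left.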

4.2.7     Enable 508 Compliance

The Enable 508 Compliance option, on the General tab, activates 508 Compliance accessibility settings for your workstation. It enables 508 Compliance for all projects you work with from this station. Users at other workstations who do not want to use these features do not have to use them, even if they work on the same projects you do. To enable the 508 Compliance options, select Enable 508 Compliance on the General tab.

When 508 Compliance is enabled:

  A blue arrow shows which field has focus.

  Additional visual indicators besides color highlighting help distinguish between invalid fields, valid fields, and questionable fields. These indicators are present in table fields and form fields. Green check marks show valid fields, red Xs show invalid fields, and orange question marks show questionable fields. Field candidates are highlighted in yellow, but do not have additional validity icons.

  All menu items have underscored letters available by [Alt] menu shortcuts.

  Pop-up menus for workflow state lists and exception handling can be activated by the right-click key on the keyboard. This key is on the right of the standard keyboard, in between the Windows key and the [Ctrl] key.

  In Show Selected Batch, the right-click keyboard key activates the shortcut menu for Append This Document to Previous One and Cut Pages into a New Document.

  During document verification, pressing [Ctrl] + [M] or selecting Show Selection Context Menu activates the shortcut menu for the currently selected item.

  In the highlight columns for interactive learning mode, unmapped column items are indicated by a blue rectangle without icons while valid and invalid column items are indicated by rectangles with a valid or invalid icon at the left side of every item.

If input focus is lost for any reason, the user can manually restore it from the main menu or by pressing [Ctrl] + [N].

4.3     About the Workflow Settings

The workflow settings let you define the workflow of documents that are processed in Verifier. After each processing step, output states are assigned to batches, which distinguish success from failure.

4.4     About Exception Handling Settings

To specify what to do if the verification cannot be finished normally, select the Exception Handling tab.

A document with an unexpected error may not be suitable for verification. Moving the document into an exception state will flag the batch for issues.

Having a mechanism to handle unexpected failures allows operators to remove the batch from their task list. Then operators can manually assign special states to documents.

4.4.1     Select States

For each selected state, a menu command is available in the Verification View. The menu commands allow for case-specific handling of various types of unforeseeable errors. The description represents the menu command's label. To select a state and set its description label, complete the following steps:

1.       On the Exception Handling tab, to enable a state, select the corresponding checkbox.

Note: The available exception states cover the range from 601 to 699. A batch state corresponds to the lowest document state within the batch. Routing batches using their exception state is only possible if the state for successful verification is greater than the one used for exceptions.

2.       To set the description label, right-click on the existing label and select New Description from the popup menu.

3.       Type the label into the corresponding field and confirm.

Note: The maximum length allowed for a description is 128 characters.
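The relationship between document states, batch states, and exception states described in the first note can be sketched as follows. The function names are hypothetical, and the states 600 and 700 used in the usage note are illustrative values only. The range of 601 to 699 and the rule that a batch takes the lowest document state come from this section.

```python
from typing import Iterable

EXCEPTION_STATE_MIN = 601
EXCEPTION_STATE_MAX = 699


def is_exception_state(state: int) -> bool:
    """Return True if the state lies in the documented exception range."""
    return EXCEPTION_STATE_MIN <= state <= EXCEPTION_STATE_MAX


def batch_state(document_states: Iterable[int]) -> int:
    """A batch state corresponds to the lowest document state in the batch."""
    states = list(document_states)
    if not states:
        raise ValueError("a batch must contain at least one document")
    return min(states)


def can_route_by_exception_state(verified_state: int,
                                 exception_state: int) -> bool:
    """Routing works only if the verified state exceeds the exception state."""
    return is_exception_state(exception_state) and verified_state > exception_state
```

If successfully verified documents receive state 700 and one document is moved to exception state 650, the batch state becomes 650, so the batch can be routed by its exception state. If verified documents received state 600 instead, the batch state would stay at 600 and the exception would not be visible at batch level.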

4.4.2     Configuring Exception Handling

The following settings are available for exception handling:

Before Moving a Document to an Exception State, Save It Automatically

Saves a document automatically before moving it to an exception state. This applies only to the respective current document.

Create New Batches with Documents Marked for Exception Handling

When this option is selected, the documents that are marked for exception handling will be moved to an exception batch.

  A batch is created for each exception code.

  The new batch receives a new batch ID.

  Documents from all verified batches are moved to the same exception batch in the Batch List.

  These batches can be released manually or automatically.

When this option is disabled, documents marked for exception handling stay in their batches. These batches keep their batch ID but are renamed according to the state description.

Automatically Release All Available Pending Exception Batches that contain N or more documents or older than M minutes

When this option is selected, an exception batch is released once it contains the defined number of documents or more, or is older than the defined number of minutes. This allows critical exception documents to be processed without waiting for manual intervention. Exception batches are also released when the user exits the application and logs in again.
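The release condition can be expressed as a short sketch. It follows the option label (N or more documents, or older than M minutes). The function and parameter names are hypothetical, and how Verifier measures the age of a batch is not described in this guide, so the creation time used here is an assumption.

```python
from datetime import datetime, timedelta


def should_release(document_count: int, created_at: datetime, now: datetime,
                   min_documents: int, max_age_minutes: int) -> bool:
    """Return True if a pending exception batch qualifies for release.

    min_documents corresponds to N and max_age_minutes to M in the label.
    """
    # Assumption: age is measured from the creation time of the batch.
    age = now - created_at
    return (document_count >= min_documents
            or age > timedelta(minutes=max_age_minutes))
```

With N set to 10 and M set to 30, a batch of three documents is held back for 30 minutes, while a batch that reaches ten documents is released at once.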

By Default, set exception mode to Batch

Selecting this option automatically changes the scope of the Move to Exception State command on the Options menu to Batch.

Allow user selection of exception mode (Batch vs. Document)

This setting allows users to change the exception mode on the Options menu in the document verification view. This option is enabled by default. Use this option together with the option above to preserve a specific exception mode for the different user groups.

4.5     Quality Assurance with WebCenter Forms Recognition

To properly ensure the quality of automatically processed documents, there are two things you need to understand:

  Batches are the basic entity. WebCenter Forms Recognition works on batches and will completely process one batch before processing the next. Verifying and approving entire batches before routing to subsequent systems is an important step.

  A batch is valid only if all documents and processing results associated with the batch are valid. Because we are dealing with information and data, we do not use the terms working or damaged. Instead, we use the terms valid or invalid.

  A batch is invalid if one or more folders inside the batch are invalid.

  A folder is invalid if one or more documents inside the folder are invalid.

  A document is invalid if it has been classified automatically, but the classification result is invalid, or data has been extracted automatically from it, but at least one field is invalid.

  A classification result is invalid if no matching class could be found, or the class has been changed manually and not yet validated.

  A field is invalid if it could not be filled, its content does not comply with validation rules that have been defined, or its content has been changed manually and not yet validated.
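The bottom-up validity rules above can be summarized in a small model. This is a sketch for illustration only: the class and attribute names are hypothetical and do not correspond to the product's object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FormField:
    filled: bool = True
    passes_rules: bool = True
    changed_unvalidated: bool = False  # changed manually, not yet validated

    def is_valid(self) -> bool:
        return self.filled and self.passes_rules and not self.changed_unvalidated


@dataclass
class Document:
    class_found: bool = True
    class_changed_unvalidated: bool = False
    fields: List[FormField] = field(default_factory=list)

    def is_valid(self) -> bool:
        classification_ok = self.class_found and not self.class_changed_unvalidated
        return classification_ok and all(f.is_valid() for f in self.fields)


@dataclass
class Folder:
    documents: List[Document] = field(default_factory=list)

    def is_valid(self) -> bool:
        return all(d.is_valid() for d in self.documents)


@dataclass
class Batch:
    folders: List[Folder] = field(default_factory=list)

    def is_valid(self) -> bool:
        # One invalid field anywhere below makes the whole batch invalid.
        return all(f.is_valid() for f in self.folders)
```

A single field that could not be filled therefore makes its document, its folder, and its batch invalid until the field is corrected and validated.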

Field validation rules may be violated for a number of reasons:

  The set of allowed characters may be restricted.

  Only uppercase characters may be allowed.

  There may be restrictions on the number of characters the field can contain.

  WebCenter Forms Recognition may enforce that characters which could not be identified with certainty during the OCR process must be checked. These questionable results are indicated in red and are underlined.
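The first three kinds of rule can be illustrated with a simple check. This sketch does not reproduce how validation rules are defined in Designer. The function name, the parameters, and the violation messages are assumptions, and only a maximum length is modeled for the restriction on the number of characters.

```python
from typing import List, Optional


def rule_violations(text: str, allowed_chars: Optional[str] = None,
                    uppercase_only: bool = False,
                    max_length: Optional[int] = None) -> List[str]:
    """Return a list of violated rules for a field value (empty if valid)."""
    violations = []
    if allowed_chars is not None and any(ch not in allowed_chars for ch in text):
        violations.append("disallowed character")
    if uppercase_only and text != text.upper():
        violations.append("lowercase character")
    if max_length is not None and len(text) > max_length:
        violations.append("too long")
    return violations
```

A field whose value produces an empty list satisfies these rules. Any other result would leave the field invalid until an operator corrects it.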

4.6     Configure Supervised Learning

The Supervised Learning tab is not available unless Supervised Learning was enabled for the project by the administrator. To configure Supervised Learning in Verifier with a WebCenter Forms Recognition database, complete the following steps:

1.       To enable Supervised Learning, on the Supervised Learning tab, select Activate Supervised Learning workflow.

2.       In the Local project name field, type or browse to the local project.

Note: Configure the local project with its own base directory (local learn set) and batch root.

3.       In the Knowledge base directory, type or browse to the common learn set directory.

Note: The common learn set updates whenever the local learn set is migrated to it.

4.       Optional. To push the local learn set to the common learn set, select Distribute Local Learnset to Knowledge base and complete any of the following sub steps:

Note: This option automatically places any documents added to the local learn set into a queue, so that the Learnset Manager can review whether the documents are appropriate for the global learn set. The knowledge base is often referred to as a queue of accumulated documents, or common learn set, pending review by the Learnset Manager.

  To prevent the learn set from being trained locally, select Nominate for the learnset but never train locally.

  Select Use Database as knowledge source and then select a job from the Select Job list.

Note: To use this option, you need to have a job for a common learnset in the database. If this common learnset job is not available, create a database job in Runtime Server with the common learnset directory as the batch root.

5.       Always show state of all field locations after opening a document is reserved for future enhancements and not available yet.

6.       To create a learnset, select Apply local classification and extraction automatically.

Note: When no local project or learnset is used, the global project and global learnset are used instead.

Now the SLW user can perform classification and automatic extraction in Verifier. Also, the SLM user can now launch Learnset Manager to verify the learnsets from the Common Learnset in the Accumulated Documents Browsing mode.

7.       To receive a notification in case of a discrepancy between the script and the commands you ran to populate the learnset, select Prompt if script forces or rejects insertion to Learnset.

8.       Under Put document to Local Learnset, select one of the following options:

  To add a document to a learnset on a user's request, select Only if adding activated by a user.

  To add documents automatically to the learnset if the specified threshold is exceeded, select Automatically if more than N% invalid fields, then type the threshold value in the % field, and optionally select Always prompt before adding to display a confirmation dialog box before the document is added.

9.       Under Learn new documents, select one of the following options:

  To initiate learning only when a user requests it, select Only by user request.

  To initiate learning for every batch in the project each time any batch is closed, select Before batch closing.

  To initiate learning anytime a document is added to the learnset, select Immediately.

4.6.1     Supervised Learning Settings

Supervised Learning is the interactive verification and training of learnsets. Selecting the Activate Supervised Learning Workflow checkbox enables Supervised Learning.

The individual Base Settings for Supervised Learning are described below:

Local Project Name

The file and pathname for the local project.

Knowledge Base Directory

The file and pathname of the common learnset. The common learnset will be updated whenever the local learnset is migrated to it.

Distribute Local Learnset to the Knowledge Base

Automatically adds any documents added to the local learnset into a queue for the Learnset Manager to review if the documents are appropriate to be added to the global learnset. The knowledge base is often referred to as a queue of accumulated documents or common learnset, pending review by the Learnset Manager for improvements into the project file.

Nominate for the Learnset but Never Train Locally

This option enables you to prevent the learnset from being trained locally.

Use Database as Knowledge source

Here you are able to select the desired job from the list. This list shows all batch jobs in the database, when the Use Database option is selected.

Always Show State of All Field Locations After Opening a Document

Not available in this version.

Apply Local Classification and Extraction Automatically

New classes will be created using the supplier's name. A learnset should also be created if you select this setting. When no local project or learnset is used, the global project and global learnset will be used instead.

Prompt if Script forces or Rejects Insertion to Learnset

You will be notified if there is a discrepancy between script and your commands regarding the population of the learnset.

Note: This information should be inherited from the settings your project administrator established in Designer.

The individual settings that comprise the Put Document to Local Learnset group are described below:

Only if Adding Activated by a User

If selected, a document will be added to the learnset only if the user requests it. This will be done automatically with no confirmation.

Automatically if More Than N% Invalid Fields

If selected, documents will automatically be added to the learnset if the threshold you set in the associated text field is exceeded.
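The threshold comparison can be sketched as follows. The function names are hypothetical, and the assumption is that the percentage is calculated over all fields of the document, which this guide does not spell out. The strict comparison follows the wording "more than N%".

```python
from typing import List


def invalid_percentage(valid_flags: List[bool]) -> float:
    """Return the share of invalid fields in percent (0.0 for no fields)."""
    if not valid_flags:
        return 0.0
    invalid = sum(1 for is_valid in valid_flags if not is_valid)
    return 100.0 * invalid / len(valid_flags)


def should_add_to_learnset(valid_flags: List[bool],
                           threshold_percent: float) -> bool:
    """Return True if the document exceeds the invalid-field threshold."""
    return invalid_percentage(valid_flags) > threshold_percent
```

For a document with two invalid fields out of four, the share is 50 percent, so the document is added with a threshold of 40 but not with a threshold of 50.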

The individual settings that comprise the Learn New Documents group are described below:

Only by User Request

Learning is initiated only when a user asks for it.

Before Batch Closing

Learning is initiated for every batch in the project each time any batch is closed.

Immediately

Learning is initiated anytime a document is added to the Learnset.

4.7     Advanced Settings

Select the Advanced tab for additional features. The Project File Updating settings are described below:

Activate Project File Updating

Select this checkbox to activate the project file updating feature.

Source Project File Location

The path and file name of the source project file.

4.8     About Batch Filtering

The Batch Filter function enables you to specify filter conditions on which batches should be displayed. This is useful if you want to find a subset of batches in a large job, or to limit Verifier user activities.

The filtering dialog box is accessible outside the settings dialog box so that users without a SET role but with the FLT role are able to filter batches. Only users having the FLT role assigned will be able to configure filter conditions.

The saved filtering settings apply to the current batch list, and the application saves them for future sessions.

4.8.1     Configure Batch Filter Conditions

To configure batch filter conditions, complete the following steps:

1.       From the Options menu, select the Filtering... option, or click the Batch Filter toolbar button. The Batch Filtering Conditions Properties dialog box is displayed.

2.       Double-click on an entry in the left pane to select a batch attribute. The Filter Condition field is populated with the selected attribute.

3.       Double-click on a filter condition in the right pane. The selected condition is added to the Filter Condition field.

4.       Complete the filter condition by editing the Filter Condition field as required. The following list describes the data types that are valid for each attribute:

  Batch ID: String or number.

  State: Number.

  Priority: Number.

  Name: String.

  Folders: Number.

  Documents: Number.

  Client: String.

  Last User: String.

  Last Module: String.

  Last Access: Date.

5.       Click OK to close the dialog box and apply the filter to the Batch List.

Note: To clear the batch filter, open the Batch Filtering Conditions Properties dialog box and click the Clear Condition button.
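The data types listed in step 4 can be captured in a lookup table, for example to check a value before it is typed into the Filter Condition field. This sketch illustrates the type list only. It does not reproduce the filter condition syntax, which is not shown in this guide, and the function name is hypothetical. Whole numbers stand in for "Number" as a simplifying assumption.

```python
from datetime import date

# Attribute names and data types as listed in step 4.
ATTRIBUTE_TYPES = {
    "Batch ID": (str, int),
    "State": (int,),
    "Priority": (int,),
    "Name": (str,),
    "Folders": (int,),
    "Documents": (int,),
    "Client": (str,),
    "Last User": (str,),
    "Last Module": (str,),
    "Last Access": (date,),
}


def accepts(attribute: str, value: object) -> bool:
    """Return True if the value has a data type valid for the attribute."""
    expected = ATTRIBUTE_TYPES[attribute]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, expected) and not isinstance(value, bool)
```

For instance, a Batch ID may be given as text or as a number, while State accepts a number only.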

5      Configure Tasks to Perform at the Workstation

To specify the tasks that are to be carried out at the current Verifier station, complete the following steps:

1.       On the Verifier Properties dialog box, select the Workflow tab.

2.       Complete one or more of the following options:

  To configure document separation, ensure the Document Separation button is pressed.

  To configure classification verification at this workstation, ensure the Classification Verification button is pressed.

  To configure extraction verification, ensure the Extraction Verification button is pressed.

Note: These steps can be performed at the workstation.

5.1.1     Configure the Input and Output States

After you have selected the workflow steps to perform, establish values for the Input State and Output State for each enabled workflow step. To add an input value, complete the following steps:

1.       Right-click on the Input State list box and select Add State from the pop-up menu.

Note: You can also change states and delete states this way.

2.       To set an Output State value, select the value from the drop-down list to the right of the workflow step buttons.

 

5.1.1.1      About Verification Rules

The following verification rule options are available:

Verify Document for the Lowest Input Verification State Only

When this option is selected, documents are corrected in groups by input state. After the documents of each input state have been verified, the user is asked to release the batch, even if documents with a higher input state still need to be corrected. This option is valuable when you use several forms to verify extraction fields. If you have several forms defined for default processing (meaning that this option is not selected), all forms are shown for the document that is being corrected.

By Default Open the First Available Invalid Batch and Not the Selected One

If this option is selected, the first available invalid batch is opened, rather than the batch that the user selected from the batch list. The first invalid batch is selected based on priority (higher first), the user's custom filter, sort settings, and the batch ID. This option is intended for projects with a large number of batches and simultaneous Verifier users, and it reduces delays in project verification.

Perform Automatic Extraction After Classifying Documents Manually

When selected, this option forces WebCenter Forms Recognition to attempt to automatically extract data after the Verifier operator manually classifies the document. To select this option, the output state of the Classification Verification workflow step must be entered as an input state for the Extraction Verification workflow step.

Keep Showing Current Document After Saving

When this option is selected, Verifier displays the current document after it has been saved, instead of automatically displaying the next document.

Allow Immediate Copying of Selected Area to a Field or Table Cell

When this option is selected, Verifier allows copying of a selected area to a field or table cell when verifying. This may speed up the verification process by copying single words and candidates to verification elements.
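The selection order described for the By Default Open the First Available Invalid Batch rule can be pictured as a filter followed by a sort. The Python sketch below is a simplified illustration, not Verifier's implementation: the `Batch` type, the `first_available_batch` helper, and the example state values are hypothetical, "invalid" is modeled as "in one of the configured input states", and the user's sort settings are left out.

```python
from typing import NamedTuple


class Batch(NamedTuple):
    batch_id: str
    priority: int        # 1 is the highest priority, 9 the lowest
    state: int           # 0 to 999
    locked: bool = False


def first_available_batch(batches, input_states, user_filter=lambda b: True):
    """Return the batch to open, or None if no batch qualifies.

    A batch qualifies if it is not locked, its state is one of the
    configured input states, and it passes the user's filter.
    Qualifying batches are ordered by priority (1 first), then batch ID.
    """
    candidates = [
        b for b in batches
        if not b.locked and b.state in input_states and user_filter(b)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (b.priority, b.batch_id))


batches = [
    Batch("0003", 5, 550),
    Batch("0001", 5, 550, locked=True),
    Batch("0002", 2, 700),
    Batch("0004", 2, 550),
]
```

With these hypothetical batches and an input state of 550, the helper returns batch 0004: batch 0001 is locked, batch 0002 is not in the input state, and batch 0004 has a more urgent priority than batch 0003.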

 


 

6      Getting Familiar with the User Interface

6.1     Show Batch List

The first window displayed after starting Verifier is called the Batch List because it shows a list of batches. This is your work list.

To access the Batch List, select the Batch List option from the View menu, or click the Show Batch List toolbar button.

Note: If Verifier is not yet configured, the list of batches will appear empty.

6.1.1     Batch List Keyboard Shortcuts

The following shortcuts are available in the Batch List:

Keyboard Shortcut               Command
[Ctrl] + [1]                    Batch List View.
[Ctrl] + [2]                    Verification Mode.
[Ctrl] + [3]                    Document Separation Mode.
[Ctrl] + [N]                    Restore Focus.
[F5]                            Refresh.
[Ctrl] + [E]                    Release Exception Batches.

6.1.2     Batch List Toolbar Buttons

The toolbar provides quick access to some frequently used commands.

Button

Description

Display a property sheet where you can configure Verifier.

Display a dialog box where you can configure the batch filtering conditions properties.

If you click on the arrow to the right of this button, the available filters for the list of batches are displayed. You can select one of the following options:

   All Batches.

   Batches to Verify, Classification Only.

   Batches to Verify, Indexing Only.

   Batches to Verify.

Start the verification of the currently selected batch. Depending on the batch state, the batch is either displayed in the classification window or in the indexing window. From the list you can select one of the following options:

   Verify the selected or next batch.

   Verify the first invalid batch.

Display the batch structure of the currently selected batch. Selecting a document shows the Document List, which provides an overview of the documents inside the batch.

Start the Learn Set Manager application.

6.1.3     Navigation Toolbar Buttons

The navigation toolbar enables you to easily navigate through a large number of batches. You can also configure the number of batches that appear on each batch page.

Button

Description

Go to the first batch page.

Go to the previous batch page.

Go to the next batch page.

Go to the last batch page.

Refresh the batch list.

6.1.4     Batch List Icons

For each batch, an icon indicates its status. When no icon is shown, the batch state is outside the configured workflow. In that case, you can select another batch or change the workflow settings.

Symbol

Status

Batch is finished and ready for export.

Batch requires a correction of the classification results.

Batch requires a correction of the extraction results.

Batch is locked and unavailable, as it is in use by another application. Therefore it cannot be opened for correction.

Batch contains documents with exception statuses. When it is unavailable, it needs to be released before you can work on it again.

6.1.4.1      Batch List Columns

The batch list can be sorted by each column. The table columns display the following information about the batch:

Batch ID

A number that can be used to uniquely identify the batch.

State

An integer between 0 and 999 that indicates the progress of batch processing. The state also indicates whether the batch is ready for verification.

Priority

An integer between 1 and 9 that indicates how urgent it is that a job be finished, where 1 is the highest priority and 9 is the lowest.

Name

An arbitrary name that is easier to read than the batch ID. Because the name is optional, it might be missing or set to a default value.

Folders

Documents in a batch can be grouped in structures called folders. The value in this column indicates the number of folders inside the batch.

Documents

The value in this column indicates the number of documents inside the batch.

Client

Not applicable. This column is not currently used.

Last User / Module

The computer name of the operator who last processed the batch, and the name of the application that most recently processed the batch.

Last Access

Displays the date when the batch was last processed.

External Group ID

Optional. The group ID assigned to a batch for security purposes. Batches can be assigned to a user group by means of a unique ID.

This column is not displayed by default.

External Batch ID

Optional. The name of the batch group. This can be used to represent any piece of information you would like to associate with a batch, for example, an external system ID or storage box ID.

This column is not displayed by default.

Transaction ID

Optional. The transaction ID assigned to a batch. This allows the developer to synchronize a newly created batch of documents with another external system. It can be used to identify the originator of a batch of documents.

This column is not displayed by default.

Transaction Type

Optional. The transaction type assigned to a batch. This allows the developer to synchronize a newly created batch of documents with another external system. It can be used to identify the types of documents in batches or the source of the documents.

This column is not displayed by default.

Note: The External Group ID, External Batch ID, Transaction ID, and Transaction Type columns are not displayed by default.

6.1.4.2      Sort in Batch List

You can sort the Batch List by any column. To sort, click the title of the column.

Batches sort according to their position on the list. If you select the first batch, and then click the Batch column label, it moves to the bottom of the list. For other items, the values toggle between ascending and descending order, whether numeric or alphabetical.

6.1.4.3      Select a Batch in Batch List

To select a batch in the table of batches, simply click on it. You can then move through the list using the following keyboard commands:

  To move to the first batch, press the [Home] key.

  To move to the next batch, press the [Down] cursor key.

  To move to the previous batch, press the [Up] cursor key.

  To move to the last batch, press the [End] key.

  To move one page down, press the [Page Down] key.

  To move one page up, press the [Page Up] key.

6.1.4.4      Leave the Batch List

To leave the Batch List and switch to another view, use one of the following keyboard commands:

  To verify the selected batch, press [Ctrl] + [2].

  To view the selected batch, press [Ctrl] + [3].

6.2     About the Document List

Document List displays the batch structure of the currently selected batch. Selecting a document provides an overview of the documents inside the batch. You can use Document List to investigate the documents in a selected batch.

6.2.1     Open Document List

To open Document List, click the Show Document List toolbar button.

6.2.2     Document List Keyboard Shortcuts

The following keyboard shortcuts are available in Document List:

Keyboard Shortcut               Command
[Ctrl] + [P]                    Print.
[Ctrl] + [Alt] + [Home]         First document.
[Ctrl] + [Alt] + [Page Down]    Next document.
[Ctrl] + [Alt] + [Page Up]      Previous document.
[Ctrl] + [Alt] + [End]          Last document.
[Ctrl] + [8]                    Append document.
[Ctrl] + [9]                    Cut document.
[Ctrl] + [Enter]                Accept/reject next unsure page.
[Ctrl] + [Space]                Select next unsure page.
[Ctrl] + [1]                    Display the Batch List.
[Ctrl] + [2]                    Start the verification of the selected batch.
[Ctrl] + [3]                    Display the selected batch in Batch List.
[Ctrl] + [N]                    Manually restore input focus without using the mouse.
[Ctrl] + [+]                    Zoom in.
[Ctrl] + [-]                    Zoom out.
[Ctrl] + [Left]                 Move image to left.
[Ctrl] + [Right]                Move image to right.
[Ctrl] + [Up]                   Move image upwards.
[Ctrl] + [Down]                 Move image downwards.
[Ctrl] + [R]                    Rotate the image.
[Ctrl] + [Home]                 First page in document.
[Ctrl] + [Page Down]            Previous page in document.
[Ctrl] + [Page Up]              Next page in document.
[Ctrl] + [End]                  Last page in document.
[Ctrl] + [M]                    Show selection context menu.
[Ctrl] + [Z]                    Undo.
[F7]                            Reclassify manually.
[F3]                            Show last verified document.
[F8]                            Get last value for selected field.
[F9]                            Move to exception state.
[Ctrl] + [E]                    Release exception batches.
[Ctrl] + [L]                    Apply local extraction.
[Ctrl] + [A]                    Add document to learnset.
[Ctrl] + [T]                    Correct tables.
[Ctrl] + [Q]                    Switch table highlighting.

6.2.3     Main Toolbar Buttons

The toolbar provides quick access to some frequently used commands.

Button

Description

Display a property sheet where you can configure Verifier.

Display a property dialog box where you can configure the batch filtering conditions.

Display the Batch List.

Start the verification of the currently selected batch. Depending on the batch state, the batch is either displayed in the classification window or in the indexing window. A dropdown list allows users to verify the selected batch or verify the next invalid batch.

Display the available filters for the batch structure. You can select from among the following options.

   All documents

   Documents to Classify

   Index Documents to Classify

   Documents to Index

Start the Learn Set Manager application.

Display the first page of the selected batch, or a single page of the selected document.

Display the first two pages of the selected batch, horizontally, or the first two pages of the selected document.

Display the first three pages of the selected batch, horizontally, or the first three pages of the selected document.

Display the first two pages of the selected batch, vertically, or the first two pages of the selected document.

6.2.4     About the Batch Structure Area

In the batch structure, Verifier displays a hierarchical representation of the batch contents.

The levels of this hierarchy are:

  Batch.

  Folder.

  Document.

For each document entry, Verifier provides the following information:

ID

A number that can be used to uniquely identify the batch, folder, or document.

State

An integer value between 0 and 999 that indicates the progress of batch processing. The batch state is calculated from the states of its folders. It corresponds to the lowest value of all folder states. The folder state is in turn calculated from the states of the documents. It corresponds to the lowest value of all document states.

Name

An arbitrary batch or folder name that is easier to read than the ID. Because the name is optional, it might be missing or set to a default value.

Document Class

A document's classification result. This entry might be missing if the document has not been classified.
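The State entry above describes a two-level minimum: a folder takes the lowest state of its documents, and a batch takes the lowest state of its folders. The following Python sketch illustrates that rule. The function names and the example state values are hypothetical and are not part of Verifier.

```python
def folder_state(document_states):
    """A folder's state is the lowest state among its documents."""
    return min(document_states)


def batch_state(folders):
    """A batch's state is the lowest state among its folders.

    folders maps a folder name to the list of its document states.
    """
    return min(folder_state(states) for states in folders.values())


# Hypothetical batch with two folders.
folders = {
    "Folder 1": [800, 550, 800],
    "Folder 2": [700, 700],
}
```

Here `batch_state(folders)` evaluates to 550, because a single document in state 550 holds back its folder and therefore the whole batch. The sketch assumes that every folder contains at least one document.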

6.2.5     Navigate in the Document List

To navigate in the batch structure, choose from among the following keyboard commands:

  To move to the first document, press [Ctrl] + [Alt] + [Home].

  To move to the next document, press [Ctrl] + [Alt] + [Page Down].

  To move to the previous document, press [Ctrl] + [Alt] + [Page Up].

  To move to the last document, press [Ctrl] + [Alt] + [End].

  To expand or collapse a folder, double-click on it, or click the plus sign or minus sign next to it.

6.2.5.1      About Splitting and Appending Documents

In the document list, you can split multipage documents into separate documents, with the exception of the first page of a document, which cannot be split. You can also merge consecutive pages of documents into one with multiple pages.

6.2.5.2      Split a Multipage Document

To split a multipage document, complete the following steps:

1.       From the View menu, select Show Selected Batch, then All Documents.

2.       Select Show document list.

3.       Click a multipage document you want to split.

4.       Right-click the second page, then select one of the following options:

  Select Cut Pages into a New Document to split the document into two documents.

  Select Cut Pages into New Document Keeping Cover Page to split the document into several smaller documents and corresponding TIF files. Using this option includes the cover page of the original document as the cover page for the newly created documents.

Note: This option is not available until you have marked a page as a cover page. You can do this by right-clicking on the first page of the document and selecting Mark as Cover Page.

Note: If you have changed the list sorting, such as switching the batch ID to descending order, the cut operation is not available.

5.       Optional. If you have changed the list sorting, choose one of these options when the following prompt is displayed:

Would you like to switch back to the original sequence of the documents in batch?

  Click No to keep the sorting and disregard the cut operation.

  Click Yes to revert to the original sorting and cut the document.
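The effect of the two cut options on a document's pages can be sketched as list slicing. This Python example is a simplified model under stated assumptions: the `cut_pages` helper is hypothetical, a document is represented as a list of page names, and only a single cut is shown, although the Keeping Cover Page option is described above as able to produce several documents.

```python
def cut_pages(pages, cut_at, keep_cover=False):
    """Split pages into two documents at the 0-based index cut_at.

    The first page cannot be split off, so cut_at must be at least 1.
    With keep_cover=True, the first page of the original document is
    also placed at the front of the new document.
    """
    if not 1 <= cut_at < len(pages):
        raise ValueError("cut position must be after the first page")
    original = pages[:cut_at]
    new_document = pages[cut_at:]
    if keep_cover:
        new_document = [pages[0]] + new_document
    return original, new_document
```

For example, `cut_pages(["cover", "p2", "p3", "p4"], 2, keep_cover=True)` returns `(["cover", "p2"], ["cover", "p3", "p4"])`, while a cut position of 0 raises `ValueError`, which mirrors the rule that the first page of a document cannot be split.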

6.2.5.3      Append Two Documents

To append a document to another, complete the following steps:

1.       From the View menu, select Show Selected Batch, then All Documents.

2.       Select and right-click the document to append to the previous document.

3.       Select Append This Document to Previous One.

Note: If you have changed the list sorting, such as switching the batch ID to descending order, the append operation is not available.
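Appending is the reverse of cutting: the pages of the selected document are added to the end of the document that precedes it in the original batch order. The Python sketch below models this with lists. The `append_to_previous` helper and the list-of-pages representation are hypothetical illustrations, not Verifier functions.

```python
def append_to_previous(documents, index):
    """Merge the document at index into the document before it.

    documents is a list of documents in their original batch order,
    each represented as a list of pages. A new list is returned and
    the input list is left unchanged.
    """
    if not 1 <= index < len(documents):
        raise IndexError("there is no previous document to append to")
    merged = documents[index - 1] + documents[index]
    return documents[:index - 1] + [merged] + documents[index + 1:]
```

The sketch relies on the documents being in their original sequence, which is consistent with the note above that the append operation is not available after the list sorting has been changed.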

6.2.6     Viewer Toolbar Buttons

The viewer toolbar allows you to adjust the magnification used to display documents using the following commands.

Button

Description

Fits the document to window height.

Fits the document to window width.

Provides the best fit for an image.

Zooms in.

Zooms out.

6.2.7     About the Document Area

The document area shows the first page of the document that has been selected in the batch structure.

Verifier can be configured to display a specific page of each document by default instead of the first one. For more information, refer to the VerifierFormLoad event in the Oracle WebCenter Forms Recognition Scripting User's Guide.

6.2.8     Print a Document

To print a document, from the File menu, select the Print option.

Note: This function is available in all modes of Verifier with the exception of Batch Browsing Mode.

6.2.8.1      Printing Verified Data Content

You can configure the amount of data on a printed form from the Print Setup dialog box. To access the Print Setup dialog box, select Page Setup... from the File menu.

The order in which the fields are printed is defined by the order of the fields configured in the project.

When a document prints, the header includes the document file name and the document class.

The following print setup options are available:

Print Image

When selected, also prints pages of the document file.

Print Form

Activates printing of the verification form.

Print Hidden Fields

When selected, Verifier prints not only the fields visible on the current verification form, but all the fields available in the loaded document.

Print Table Fields

When deselected, Verifier does not print table fields. Disabling this option might be useful for quick printing of documents with long tables.

Print column header on each printed table page

Enabled only if the Print Table Fields option is selected. When this option is selected, Verifier prints column headers on each page. This option is useful for printing long tables.

This option is selected by default.

Always show the dialog box when printing

Displays the Print Setup dialog box each time a user prints.

6.2.9     About the Verification View Classification Window

Verification involves taking a document that has been processed, or partially processed, checking the processing results, and correcting any errors.

When you open verification view, the classification window displays automatically if the next document that is to be verified requires a correction of the classification result. Whether this is the case depends on the state of the document.

6.2.9.1      Open the Verification View Classification Window

To display Verification View, complete the following steps:

1.       Select a batch from the list that requires verification.

2.       Click the Verify Selected Batch button. Alternatively, double-click the batch in the list.

6.2.9.2      Classification Window Keyboard Shortcuts

The following keyboard shortcuts are available in the classification window:

Keyboard Shortcut               Command
[Ctrl] + [Alt] + [Home]         First document.
[Ctrl] + [Alt] + [Page Down]    Next document.
[Ctrl] + [Alt] + [Page Up]      Previous document.
[Ctrl] + [Alt] + [End]          Last document.
[Ctrl] + [1]                    Display the Batch List.
[Ctrl] + [2]                    Start the verification of the selected batch.
[Ctrl] + [3]                    Display the selected batch in Batch List.
[Ctrl] + [N]                    Manually restore input focus without using the mouse.
[Ctrl] + [+]                    Zoom in.
[Ctrl] + [-]                    Zoom out.
[Ctrl] + [Left]                 Move image to left.
[Ctrl] + [Right]                Move image to right.
[Ctrl] + [Up]                   Move image upwards.
[Ctrl] + [Down]                 Move image downwards.
[Ctrl] + [R]                    Rotate the image.
[Ctrl] + [Home]                 First page in document.
[Ctrl] + [Page Down]            Previous page in document.
[Ctrl] + [Page Up]              Next page in document.
[Ctrl] + [End]                  Last page in document.
[Ctrl] + [M]                    Show selection context menu.
[F3]                            Show last verified document.
[F8]                            Get last value for selected field.
[Ctrl] + [L]                    Apply local extraction.
[Ctrl] + [J]                    Increase image area.
[Ctrl] + [K]                    Decrease image area.
[F7]                            Reclassify manually.
[F9]                            Move to exception state.
[Ctrl] + [A]                    Add document to learnset.
[Ctrl] + [E]                    Release exception batches.
[Ctrl] + [T]                    Correct tables.
[Ctrl] + [Q]                    Switch table highlighting.
[Ctrl] + [Z]                    Undo.

6.2.9.3      Verification View Classification Window Toolbar Buttons

The toolbar provides quick access to some frequently used commands.

Button

Description

Display the Batch List.

Verify the selected batch.

Display the Document List.

The scope of this command depends on the Exception Mode set on the Options menu. Two options are available:

   Move Document to Exception State.

   Move Batch to Exception State.

Clicking the arrow next to this button displays a list of exceptions. You can use these exceptions if you cannot correct a document at all, for example, because it belongs to none of the defined classes. Check with your administrator to determine which exceptions to use.

Note that in order to avoid selection conflicts, only the toolbar button provides a list of exception handling states to choose from. The selection made here will also apply if you move a document to exception state by selecting the appropriate option within the Options menu.

Fit the current image to the height of the window.

Fit the current image to the width of the window.

Fit the current image to the width or height of the window for maximum enlargement.

Zoom in.

Zoom out.

Display the first document in the batch.

Display the previous document in the batch.

Display the next document in the batch.

Display the last document in the batch.

Rotate the current document clockwise.

Display the first document in the batch and switch the application to Browsing Mode.

Display the previous document in the batch and switch the application to Browsing Mode.

Display the next document in the batch and switch the application to Browsing Mode.

Display the last document in the batch and switch the application to Browsing Mode.

Display the first page of the document if the current document has more than one page.

Display the previous page of the document if the current document has more than one page.

Enter a page number in order to navigate directly to it. All invalid entries, for example, alphabetical characters and page numbers out of range, are ignored, and the page number is reset to the currently displayed page.

Display the next page of the document if the current document has more than one page.

Display the last page of the document if the current document has more than one page.
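The behavior of the page number entry box in the table above (invalid entries are ignored and the displayed page stays the same) can be expressed as a small validation routine. The Python sketch below is an illustration only: the `resolve_page_entry` helper is hypothetical, and it assumes that pages are numbered starting at 1.

```python
def resolve_page_entry(entry, current_page, page_count):
    """Return the page to display after the user types entry.

    Non-numeric text and page numbers outside 1..page_count are
    ignored, and the currently displayed page is returned instead.
    """
    text = entry.strip()
    if not text.isdecimal():
        return current_page
    page = int(text)
    if 1 <= page <= page_count:
        return page
    return current_page
```

In a five-page document that is showing page 2, an entry of `"3"` moves to page 3, while `"abc"` or `"9"` leaves the display on page 2.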

6.2.9.4      About the Class Selection List

This box shows the classification result of the current document. If you open the list, you see all available classes.

The list entries represent the classes assigned to the current project or user, and are controlled by the Verifier Classify script event.

If no result could be determined, the box shows as empty.

6.2.9.5      Set or Change a Classification Result

To set or change a classification result, make sure that you are not in browsing mode, and then perform one of the following actions:

  Click on the arrow on the right side of the list box to open the list, and then select a class.

  Use the cursor keys to browse through the list of classes. The entries in the list are sorted alphabetically.

  If you know the correct class name, you can type its first characters and wait until the system automatically displays the full class name.

6.2.9.6      Select a Class Result Using Advanced Classification

To select a class result using advanced classification, complete the following steps:

1.       Click the Classification Matrix button.

Note: This button is not available unless advanced classification has been enabled for the project by the administrator in Designer.

A list opens containing one or more classes if the result could not be determined with 100% certainty. If more than one class is in the list, the entries are ordered by probability, with the most probable class at the top of the list.

2.       Select a class for the current document and then click OK.
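The ordering of the class list in step 1 amounts to a descending sort by probability. The Python sketch below illustrates this. The `rank_classes` helper, the class names, and the probability values are hypothetical and do not come from a real project.

```python
def rank_classes(probabilities):
    """Return class names ordered from most to least probable.

    probabilities maps a class name to its classification probability.
    """
    return sorted(probabilities, key=probabilities.get, reverse=True)


# Hypothetical classification result for one document.
candidates = {"Credit Note": 0.30, "Invoice": 0.62, "Order": 0.08}
```

With these values, `rank_classes(candidates)` places Invoice first, followed by Credit Note and Order, which is the order in which the operator would see the classes in the list.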

6.2.10   About the Verification View Indexing Window

The indexing window displays fields and documents specific to your organization. The layout of the window can be customized by a project designer.

The indexing window automatically displays the next document that requires a correction of the extraction result.

6.2.10.1    Open the Verification View Indexing Window

To display verification view, complete the following steps:

1.       Select a batch from the list that requires verification.

2.       Click the Verify Selected Batch toolbar button.

6.2.10.2    Increase or Decrease the Image Area

To increase or decrease the image area, complete the following step:

Drag the vertical split bar between the image area and field area either to the right or left.

Note: This option is only available in the extraction verification view.

6.2.10.3    Indexing Window Keyboard Shortcuts

The following keyboard shortcuts are available in the indexing window:

Keyboard Shortcut               Command
[Ctrl] + [P]                    Print.
[Ctrl] + [Alt] + [Home]         First document.
[Ctrl] + [Alt] + [Page Down]    Next document.
[Ctrl] + [Alt] + [Page Up]      Previous document.
[Ctrl] + [Alt] + [End]          Last document.
[Ctrl] + [1]                    Display the Batch List.
[Ctrl] + [2]                    Start the verification of the selected batch.
[Ctrl] + [3]                    Display the selected batch in Batch List.
[Ctrl] + [N]                    Manually restore input focus without using the mouse.
[Ctrl] + [L]                    Apply local extraction.
[Ctrl] + [A]                    Add document to learnset.
[Ctrl] + [J]                    Increase image area.
[Ctrl] + [K]                    Decrease image area.
[F9]                            Move to exception state.
[Ctrl] + [E]                    Release exception batches.

6.2.10.4   Verification View Indexing Window Toolbar Buttons

The toolbar provides quick access to some frequently used commands:

Button

Description

Display the Batch List.

Verify the selected batch.

Display the Document List.

Starts the Learn Set Manager application.

Click the down arrow next to the button for a list of exceptions. You can use these exceptions if you cannot correct a document at all, such as when the required data is illegible. Please check with your supervisor to determine which exceptions to use.

Marks all areas on the current document that have been used to fill the fields. If the result is valid, the area is highlighted in green. If the result is invalid, the area is highlighted in red.

Marks only the area on the current document that was used to fill the field that is currently selected in the field area. If the extraction result is valid, the area is highlighted in green. If the extraction result is invalid, the area is highlighted in red.

Marks the area that was used to fill the field that is currently selected in the field area. This area either appears in green or in red. In addition, all other areas that were taken into account to fill this field are highlighted in yellow.

Fit the current image to the height of the window.

Fit the current image to the width of the window.

Fit the current image to the width or height of the window for maximum enlargement.

Zoom in.

Zoom out.

If this button appears pressed down, the application always displays the document area that is associated with the currently selected field.

Keeps the established zoom settings on each document you view in the batch.

Display the first page of the document if the current document has more than one page.

Display the previous page of the document if the current document has more than one page.

Enter a page number in order to navigate directly to it. All invalid entries, for example, alphabetical characters and page numbers out of range, are ignored, and the page number is reset to the currently displayed page.

Display the next page of the document if the current document has more than one page.

Display the last page of the document if the current document has more than one page.

Rotate the current document clockwise.

Display the first document in the batch and switch the application to Browsing Mode.

Display the previous document in the batch and switch the application to Browsing Mode.

Display the next document in the batch and switch the application to Browsing Mode.

Display the last document in the batch and switch the application to Browsing Mode.

Applies local classification and extraction to the current document.

Adds current document to local learnset.

Starts document learning.

Starts table correction.

6.2.10.5   Support for the Mouse Wheel

Verifier supports the mouse wheel when validating documents in Document Verification mode. Mouse wheel rolling has the following effect depending on where the mouse cursor is or where the keyboard focus is:

Case

Wheel Rolling Effect

Input focus is in a multi-line header field.

Scrolls between lines of the header field.

Input focus is in a single line header field or at the first line / row of any field (scrolling up only) or at the last line / row (scrolling down only).

Scrolls the entire verification form.

Input focus is in a table field.

Scrolls between table rows or between multiple lines of the currently selected table cell (when multi-line).

Mouse pointer is in the Document Viewer area.

Scrolls the currently viewed page image up and down.

6.2.10.6   Form Area

A form has three main elements: a label, a viewer, and a field.

Labels

Labels are captions that help users to identify form fields, as well as viewers and tables.

Viewer

A viewer contains snippets of document areas, normally those that were extracted to fill fields or tables.

Fields

A field displays data and allows data to be entered or edited. A field can be a text field, table field, check box, list box, or Yes/No selection. You can use fields to create check boxes and combo boxes.

6.2.10.7   Field Area

In the field area, the following icons are used to indicate the nature of the field:

Button

Description

Indicates the currently selected field.

Indicates a valid extracted field.

Indicates a field that needs to be validated because it was extracted with low confidence.

The following list explains the field types:

1.       A user cannot edit or select a Read Only field.

2.       An Auto-completion field enables the user to edit text by typing the first few letters of a word until the best-matching candidate appears.

3.       A Multi-line field enables line wrap and displays a vertical scroll bar. This field type is a requirement for address analysis.

4.       A List box contains a selection list to verify an item in a document.

5.       Check boxes are toggle selections for data input that derive from form fields.

6.2.10.8   Navigate the Field Area

To navigate the field area, select one of the following options:

  Use the mouse. This method does not affect the validation state of a field.

  Press the [Tab] key. This method gets you to the next field, but not to the next document. This method does not affect the validation state of a field.

Note: The order in which the [Tab] key moves through the form is part of the form's design.

  Press [Shift] + [Tab] to go to the previous field.

  Press the [Enter] key. This method validates the entire field or the next invalid character within a field. Once the field is corrected, it is validated and the focus moves to the next field that requires correction. This field may also be within another document.

6.2.10.9    About the Verification View Document Area

The document area shows the currently selected document or page along with highlights.

  Red areas indicate an invalid result.

  Green areas indicate a valid result.

  Yellow areas were considered as candidates, but another candidate seemed more likely. If the extraction result is invalid or wrong, these areas may point to the correct indexing data.

Note: In practice, red, green, and yellow areas never appear in the same document.

6.2.10.10 Navigate the Document Area

To navigate the document area, choose one of the following options:

  To highlight the entire document table, click the square in the upper-left corner of the table field.

  To highlight a document column, click the column label of a table field.

  To highlight a document row, click the row label of a table field.

  To highlight a document cell, click the cell of a table field.

Note: Valid areas are green and invalid areas are red. These areas may also contain validity icons, which are green check marks for valid fields or red Xs for invalid fields.

6.2.10.11 About Tables in the Document Area

Only one table will display per verification form, even if you are able to define multiple tables. However, you can display different tables on different forms.

If you only need to verify certain columns in a table, you can make the other columns invisible. All invisible columns must be valid before the entire table can be valid.

For a large document with many line items, you can detect and view the location of all the extracted line items that are currently shown within a table field.
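The rule that invisible columns still count toward table validity can be sketched as follows. This is an illustration only. The dictionary keys `name`, `visible`, and `valid` and both function names are assumptions made for the example.

```python
def table_is_valid(columns):
    """A table is valid only if every column is valid, visible or not."""
    return all(column["valid"] for column in columns)


def blocking_invisible_columns(columns):
    """List invisible columns that keep the table from becoming valid."""
    return [
        column["name"]
        for column in columns
        if not column["visible"] and not column["valid"]
    ]
```

A table whose visible columns are all verified can therefore remain invalid until the columns returned by `blocking_invisible_columns` are also valid.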

6.2.10.12 About the Current Input Area

The current input area provides a large editing box and shows the following enlarged information for the currently selected field:

  A snippet that shows an enlargement of the document area that was used to fill the field.

  The extracted data. Color coding is used in the same way as in the field area. You can edit the data here.

6.2.10.13 About the User Information Area

The user information area is at the bottom of the Verifier window, and consists of three fields that display the following information:

  The name of the currently selected field.

  If the current field is invalid, the reason is displayed. If the current field is valid, the field is normally empty.

  The classification result of the current document.

7      About Working with Verifier

7.1     Manual Correction of Automatic Page Separation

If, during the automatic page separation process in Runtime Server, there was at least one unsure page-level decision for a batch of documents, the whole batch receives the state Failed Page Separation. Such a batch is supposed to be manually reviewed and, if required, corrected in Verifier.

You can correct the automatic page separation results in the Document Browsing mode of Verifier. When the next batch is opened, the system automatically displays the first unsure split or merged page.

The available options for automatic page separation are listed below:

Toggle the Unsure Status

Select the Accept / Reject Next Unsure Page menu command or press [Ctrl] + [Enter]. This command sets the page to the manually accepted state or to the manually rejected state, respectively. An icon indicates one of three page correction states:

  A blue page icon indicates a page extracted with high confidence by the engine.

  A blue page icon with a red question mark indicates a page extracted with low confidence by the engine (unsure).

  A blue page icon with a green check mark indicates a page manually accepted or corrected by the Verifier user.

These states are retained after the user closes the batch in Verifier and can be reviewed by other users. If all pages of a document are accepted (pages extracted with high confidence are accepted by default), the document moves to the successful page separation state. If at least one of the document's pages is manually rejected, the entire document receives the lowest page separation failed state that is configured in the Verifier settings.

Split the Document into Two Separate Documents

Select the Cut Document menu command or press [Ctrl] + [9]. The top document receives all the pages above the selected one while the bottom document receives all the pages below, including the selected page. In this case, the page correction status is automatically applied to the selected page and the preceding page. If you split previously merged documents, the original document names are restored.

Merge the Selected Document with the Previous One

Select the Append Document menu command or press [Ctrl] + [8]. In this case, the first page of the selected document and the last page of the preceding one receive the manually accepted page correction status.

Go to the Next Unsure Page

Select the Next Unsure Page menu command or press [Ctrl] + [Space]. This action selects the next unsure page to verify (the one with red question mark) without changing the state of any pages.
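The page separation options above can be modeled as operations on a list of page states. The sketch below is illustrative only. All names, including the `"success"`, `"failed"`, and `"pending"` results, are hypothetical, and it assumes that the page correction status applied by Cut Document and Append Document is the manually accepted state.

```python
from enum import Enum


class PageStatus(Enum):
    SURE = "high confidence"        # blue page icon
    UNSURE = "low confidence"       # blue page icon with red question mark
    ACCEPTED = "manually accepted"  # blue page icon with green check mark
    REJECTED = "manually rejected"


def document_state(pages):
    """Summarize page separation for one document (result names are hypothetical)."""
    if any(p is PageStatus.REJECTED for p in pages):
        return "failed"
    if all(p in (PageStatus.SURE, PageStatus.ACCEPTED) for p in pages):
        return "success"
    return "pending"  # at least one unsure page still needs review


def cut_document(pages, selected):
    """Split before the selected page; the bottom part includes that page."""
    if not 0 < selected < len(pages):
        raise ValueError("select a page other than the first page")
    top, bottom = list(pages[:selected]), list(pages[selected:])
    # Assumption: the pages on both sides of the cut become manually accepted.
    top[-1] = PageStatus.ACCEPTED
    bottom[0] = PageStatus.ACCEPTED
    return top, bottom


def append_document(previous, selected):
    """Merge the selected document into the previous one."""
    if not previous or not selected:
        raise ValueError("both documents need at least one page")
    merged = list(previous) + list(selected)
    # The last page of the previous document and the first page of the
    # selected document are marked as manually accepted.
    merged[len(previous) - 1] = PageStatus.ACCEPTED
    merged[len(previous)] = PageStatus.ACCEPTED
    return merged


def next_unsure_page(pages, start=0):
    """Return the index of the next unsure page without changing any state."""
    for index in range(start, len(pages)):
        if pages[index] is PageStatus.UNSURE:
            return index
    return None
```

For example, cutting a three-page document at its unsure second page yields a one-page document and a two-page document, and in this sketch both report `"success"` afterward.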

7.2     About Manual Correction of Classification Results

Manual correction of classification results is done if the Verifier workstation is configured with the following settings:

  Classification verification is enabled.

  Extraction verification is disabled.

To determine your settings, check the Workflow tab of the Verifier Properties dialog box.

If you do this task regularly, you may want to apply the appropriate filter in the Batch List. From the View menu, select the Batch Filter option, then select Batches to Verify, Classification Only.

7.2.1     Manually Correct Classification Results

To correct invalid classification results, complete the following steps:

1.       In Batch List, check the state column to find a batch you can verify.

2.       Open the selected batch in the Verification View.

3.       The Verification View opens in Verify Mode, with the first invalid document being displayed. The cursor is already placed in the classification list box.

4.       To select a class, either:

  Click on the arrow on the right side of the list box to open the list and then select a class.

  Use the arrow keys to browse the list of classes and make your selection. The entries in the list are sorted alphabetically.

  If you know the correct class name, type its first characters and wait until the system automatically displays the full class name.

5.       To confirm your selection, press the [Enter] key. The application validates the document and its state increases. The next document requiring verification is displayed automatically.

When all documents in the batch are validated, the application prompts you to release the batch. Click the Yes, No, or Details button, as appropriate. Clicking on Details reveals more options:

  Verify Next Invalid Batch on the List releases the current batch and opens the next batch that needs verification.

  Close Batch and Return to the Batch List releases the current batch and displays the Batch List, where you can select the next batch.

  Verify This Batch with the Next Verification Form changes verification forms and displays the next verification form.

7.3     About Processing Documents with an Obsolete Class

Deleting a class makes the class obsolete. In the Supervised Learning workflow, classes often become obsolete because the global project's configuration deletes the class or simply does not insert it.

Documents with an obsolete class can be processed only if the class still exists in the project with which the document is being processed. Information about the former parent class assignment is saved internally, which makes it possible to process documents with obsolete classes.

7.4     About Manual Correction of Extraction Results

Manual correction of extraction results is done if the Verifier workstation is configured with the following settings:

  Classification verification is disabled.

  Extraction verification is enabled.

To determine your settings, check the Workflow tab of the Web Verifier Settings or the Verifier Properties dialog box.

If you do this task regularly, you may want to apply the appropriate filter in the Batch List. From the View menu, select the Batch Filter option, then select Batches to Verify, Extraction Only.

7.4.1     Correct Invalid Results

To correct invalid results, complete the following steps:

1.       In Batch List, check the state column to find a batch you can verify.

2.       Open the batch in the Verification View.

The Verification View opens in Verify Mode and the first invalid document displays. The application places the cursor in the first invalid field and the user information area contains a message indicating why the field is invalid.

7.4.1.1      Form Elements and Field Types

A form can include the following elements:

Form Fields

Display extracted data. You can also enter and edit data during manual indexing. You can use form fields to create checkboxes and combo boxes.

Labels

Identify form fields, viewers, and tables.

Viewers

Are sections of document areas, normally those that were extracted to fill fields or tables.

Buttons

Fire actions for a new script event.

Tables

Extracted from documents.

The following is a list of field types and their description:

Read Only

When selected, information on a field is dimmed and cannot be selected or edited.

Auto-Completion

Enables you to edit text in a field by typing the first two letters of a word. Auto-completion finishes the word with the best matching candidates.

Multi-line Fields

Required in the context of address analysis but can also be useful in other cases. A multi-line field enables line wrap and displays a vertical scroll bar, if required.

List Box

A dropdown box that lists predefined strings related to the verification document. It can either show the nearest values automatically or show only selected values.

Checkbox

A toggle selection for one of two choices of the data input for a field, for example, yes/no.

7.4.1.2      About Editing Text Fields

Verifier and Web Verifier include automated features for editing text fields that can speed up text entry and correction.

You can use automatic character entry, when the auto-completion is enabled in the form field Properties dialog box, to edit text fields and cells. Other options for character changes include multi-line fields, combo boxes, and checkboxes. You can also insert and replace text in table cells and fields, either in single words or blocks of text, using drag-and-drop or by double-clicking on the selected text.

Multi-line fields are necessary for address analysis but can also be useful in other cases. A multi-line field enables line wrap and displays a vertical scroll bar, if required. To add a new line to a multi-line field press [Ctrl] + [Enter].

A combo box lists predefined strings related to the verification document. To aid in verification, you can select from the list of strings.

The checkbox provides a binary option that toggles table data entry choices on and off. For example, with a Yes or No checkbox, selecting the checkbox (Yes) displays the data entry options related to the verification, and clearing it (No) hides them.

7.4.1.3      About Auto-Completion

Auto-completion helps to speed up typing. When you start to type, auto-completion completes the word, suggesting the best match among all of the words or candidates available after OCR and format analysis.

For example, you can type the first two characters of a 20-character invoice number. The auto-completion feature finds the best matched candidate suggested by the format analysis engine and places it in that field. The auto-completion feature for a header field automatically selects the best candidate from the available ones if Verifier is in Highlight Candidates mode. However, the viewer will be updated only if the candidate appears once in the document; otherwise the viewer will be blank when auto-text completes the word for the field.

The automatically selected text also appears highlighted in the original document. Select whether a single-line or a multi-line text field should be displayed. To override auto-completion, continue typing the desired text.

Note: Auto-completion does not work on formatted text or on characters incorrectly read by OCR.
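The auto-completion behavior described in this section can be approximated by a prefix search over the available candidates. The sketch below is not the product's algorithm. It assumes that the candidates are already ordered from best to worst match and that matching ignores case, and both function names are hypothetical.

```python
def autocomplete(typed, candidates):
    """Return the first candidate that starts with the typed characters.

    Assumption: candidates are ordered best match first, as if ranked by
    the format analysis engine. Returns None when nothing matches.
    """
    prefix = typed.casefold()
    if not prefix:
        return None
    for candidate in candidates:
        if candidate.casefold().startswith(prefix):
            return candidate
    return None


def viewer_should_update(candidate, document_words):
    """The viewer is updated only if the candidate appears exactly once."""
    return document_words.count(candidate) == 1
```

With the hypothetical candidates `["INV-2019-000123", "PO-4411"]`, typing `IN` would suggest the invoice number, and continuing to type text that matches no candidate would return `None`, which corresponds to overriding the suggestion.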

7.4.1.4      About Inserting Words in Fields

To speed up verification, you can insert words to replace or append text.

The method for inserting words depends on the availability of candidates. A candidate is a word that matches the learned words for that field. When you select a candidate after selecting the field, it appears in green, with a border of green check marks if that visual indicator is enabled in Batch Options. Non-candidates display in orange when selected. You can insert words in fields or table cells, and you can use the mouse to append to or replace the existing text.

To use the Append feature, the selected word must appear on the same line as the existing cell text. If not, the selected word will replace the existing cell text.

Words that are candidates for cells: If the word belongs in a cell area, you can append or replace a word in a cell. The Append feature takes the current word behind the candidate and appends the cell text. It places the text in the best location, either to the right or left of the word, and in the cell location based on the text or location of the word. The word belonging to a cell area displays in green when selected. Or, you can replace the text. In the search region, word candidates are all words that are not covered (by location) by other table cells and that have the same beginnings as the whole text of the cell.

Words that are not candidates for a cell: If the word does not belong to cell areas, it displays in orange when selected. Even if it is not a candidate, you can append or replace the word. Appending places the text in the best location, either to the right or left of the word, based on the text or location of the word. Alternatively, you can replace the cell text and location with the text and location of the word. For example, a cell named C2658 might be appended with "number".
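The append-or-replace rule in this section can be sketched as a small function. This is an illustration under stated assumptions. The `Word` class, its `line` and `x` attributes, and the left-or-right decision based on the horizontal position are hypothetical, and the guide does not specify how Verifier determines the best location.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    line: int  # line on the page (hypothetical attribute)
    x: float   # horizontal position on the page (hypothetical attribute)


def insert_word(cell: Word, selected: Word) -> Word:
    """Append the selected word if it is on the same line, else replace."""
    if selected.line != cell.line:
        # Not on the same line: the selected word replaces the cell text
        # and location.
        return Word(selected.text, selected.line, selected.x)
    if selected.x < cell.x:
        # Assumption: a word to the left is placed before the cell text.
        return Word(f"{selected.text} {cell.text}", cell.line, selected.x)
    # Assumption: otherwise the word is placed after the cell text.
    return Word(f"{cell.text} {selected.text}", cell.line, cell.x)
```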

7.4.1.5      Use a Word that is a Candidate for a Field

If a word is a candidate for a field, you can append or replace the word in a field box. A candidate is a word that matches the learned selections for the field. To copy text to the field box, complete the following actions:

  Click on the text you want to copy. A box appears around the word.

  Double-click the box or right-click in the document and select Copy to Current Field.

Note: You can insert only one candidate per field per document verification session.

7.4.1.6      Use a Word That Is Not a Candidate for a Field

Even if the word does not belong to any candidates for the field, you can append or replace a word with a new one. Appending places the text in the best location, either right or left of the word, based on the text or location of the word. Alternatively, you can replace the field text and location with the text and location of a word. A word that does not belong to any candidates for that field displays in orange when selected. To use a word that is not a candidate for a field, complete one of the following options:

  To append the new text to the existing text, drag a box around the desired word. Double-click on the word in the box or right-click in the document and select Append Field Text by Word.

  To replace text, select the word with the mouse. A box will appear around the word. Double-click it, or select Replace Field Text by Candidate in the shortcut menu.

Note: You can insert only one candidate per field per document verification session.

Make sure that this word fits the format analysis rules defined for that field. If it does not, the word is highlighted in orange (and with a border of orange exclamation marks if validity icons are enabled) to help distinguish it, and it would not be a good candidate for the field.

7.4.1.7      Insert Blocks of Text

Inserting large blocks of text with minimal mouse movement is helpful when you have multiple word data verification elements for fields such as address information or cell descriptions. Before you can insert blocks of text, first select the settings in the Workflow dialog box to immediately copy information. To insert large blocks of text, complete the following steps:

1.       In the image viewer, drag over the text.

2.       Optional. Use the handles to adjust the selection.

3.       Drag the selection to the field or table cell.

4.       Optional. Double-click on the selection. The selected text replaces the text in the field or table cell.

7.4.2     Finish the Validation

1.       After you correct a field, press the [Enter] key to validate it.

During validation, the field's background color turns yellow, and the cursor becomes an hourglass. Once the validation is finished, the cursor moves automatically to the next invalid field regardless of whether this field is in the same invalid document, or in the next invalid document. If you leave a document this way, it is validated automatically. In the next field, proceed as described above. When all documents in the batch are validated, the application prompts you to select what to do next.

2.       Click Yes or No, or click Details. Clicking Details reveals the following options:

  Verify Next Invalid Batch in the List releases the current batch and opens the next batch that needs verification.

  Close Batch and Return to the Batch List releases the current batch and displays the Batch List where you can select the next batch.

  Verify This Batch with the Next Verification Form changes verification forms using the next verification form.

7.5     About Manual Correction of Classification and Extraction Results

Simultaneous correction of classification and extraction results is available if your workstation is configured with the following settings:

  Classification verification is enabled.

  Extraction verification is enabled.

  Automatic extraction after classification is disabled.

Note: If you do this task regularly, you may want to apply the appropriate filter in the Batch List by selecting Batch Filter and then Batches to Verify from the View menu.
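The three setting combinations described in sections 7.2, 7.4, and 7.5 can be summarized as a simple mapping. The function below is only a sketch of that mapping. Its name, parameters, and return strings are hypothetical, and it returns `None` for combinations that this guide does not describe.

```python
def correction_mode(classification, extraction, auto_extraction=False):
    """Map workflow settings to the manual correction task they enable.

    classification:  classification verification is enabled
    extraction:      extraction verification is enabled
    auto_extraction: automatic extraction after classification is enabled
    """
    if classification and not extraction:
        return "classification only"
    if extraction and not classification:
        return "extraction only"
    if classification and extraction and not auto_extraction:
        return "classification and extraction"
    return None  # combination not covered by the sections above
```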

7.6     About Smart Indexing

Organizations usually collect information about themselves and everybody they do business with. Much of this information is stored in databases. Databases can be an excellent support for indexing because they store related information that can easily be retrieved. During indexing, if you have extracted one piece of information from a document, you can obtain related pieces from the database and fill the associated fields automatically. This method is called smart indexing.

Normally, smart indexing is combined with manual indexing. Some fields of a form have to be filled in manually; some fields can be filled automatically.

For example, assume that your organization saves information related to orders in the database of its ERP system. Every order is characterized by a unique identifier and some attributes about the supplier and the items that have been ordered. Soon after an order is placed, the ordered items are delivered, and a delivery note is attached. The corresponding invoice follows soon. The delivery note and the invoice refer to the original order. They have the order's unique identifier printed on them. With this identifier, you can look up supplier information from the database when you verify the delivery note and invoice. However, new information such as the invoice date has not yet been entered into the database. This information can be supplied manually.

7.6.1     Use Smart Indexing

To use smart indexing, complete the following steps:

1.       Smart index fields can be recognized by the key icon that is displayed next to them. Select a smart index field. The field itself and all the fields that can be filled through the database lookup are marked with a yellow database icon.

2.       If the field is still empty, enter the field value. Alternatively, enter a wildcard expression, using an asterisk to represent a sequence of characters or a question mark to represent a single character.

3.       Complete one of the following options to start the lookup:

  If your application is configured accordingly and the field content is correct, validate the smart index field by pressing the [Enter] key.

  Press [Alt] + [F12].

4.       The system may respond with one of the following options:

  If the lookup yields no results, a corresponding message is displayed. Fill the lookup fields manually. If you cannot complete the fields, send the document to exception handling.

  If the lookup yields one result, the lookup fields are filled.

  If the lookup yields multiple results, and this is allowed in your application, the lookup fields are filled.

  If the lookup yields multiple results, and this is not allowed in your application, a dialog box is displayed where you can select the correct record. The lookup fields are then filled accordingly.
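The lookup steps above can be illustrated with an in-memory list of records in place of a real database. This is a sketch under stated assumptions. The record layout, the function names, and the outcome strings are hypothetical. Python's `fnmatch` module is used because it treats `*` and `?` as described above, but it also interprets square brackets, which this guide does not mention.

```python
from fnmatch import fnmatchcase


def lookup(records, key, pattern):
    """Return all records whose key value matches the wildcard pattern.

    '*' stands for a sequence of characters, '?' for a single character.
    """
    return [r for r in records if fnmatchcase(str(r[key]), pattern)]


def lookup_outcome(matches, allow_multiple):
    """Describe how the application responds (outcome names are hypothetical)."""
    if not matches:
        return "none"    # fill fields manually or use exception handling
    if len(matches) == 1 or allow_multiple:
        return "filled"  # lookup fields are filled
    return "choose"      # a dialog box lets the user select the record


# Hypothetical order records standing in for an ERP database table.
ORDERS = [
    {"order_id": "PO-1001", "supplier": "Acme"},
    {"order_id": "PO-1002", "supplier": "Globex"},
    {"order_id": "SO-2001", "supplier": "Initech"},
]
```

For example, `lookup(ORDERS, "order_id", "PO-100?")` matches two records, so the outcome depends on whether multiple results are allowed in the application.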

7.7     Check Entire Batches

To browse through all documents in a batch, complete the following steps:

1.       In Batch List, use the status value to determine a batch you can browse through.

2.       Open a batch in Verification View. The first document that requires correction is automatically displayed.

3.       To display the first document in the batch, press [Ctrl] + [Alt] + [Home] or use the appropriate toolbar button.

4.       You may encounter a document that has been classified incorrectly. To correct this result, press the [F7] key to open the classification window. To correct the class, select the corresponding entry from the list box at the bottom, then confirm by pressing the [Enter] key.

5.       This displays the indexing window again.

6.       To correct extraction results, type your corrections into the corresponding field. If a field has been changed, its state is set to invalid. Press the [Enter] key to validate the field you modified, and then press [F3] to return to the document.

7.       To get to the next document, press [Ctrl] + [Alt] + [Page Down] or use the appropriate toolbar button.

8.       Repeat the above steps as appropriate until you reach the last document.

8      About Working with Tables

You can correct invalid cells the same way you would correct an invalid text field.

Additional methods to simplify manual table extraction, such as the Correct Table, are available.

8.1     About Manual Training and Correction Methods

If automatic table extraction fails to recognize the line items properly, Verifier provides several convenient ways to correct the table manually.

8.1.1     About Auto-Completion with Tables

Auto-completion works in table cells and with text fields. When you type two or more characters, auto-complete suggests a word or phrase for that cell.

The candidate appears in green if the field is valid and red if the field is invalid. If the visual validity icons are enabled in Batch Options, valid fields also have a border of green check marks and invalid fields have a border of red question marks. This function only works with Highlight Candidates mode.

8.1.2     Insert Candidate Words in Table Cells

You can insert single words or append existing text in table cells. To insert words in table cells, complete the following steps:

1.       To select the text you want to insert in a table cell, complete one of the following actions:

  Double-click on the word.

  Right-click on the word in the image viewer and select the respective option.

  If you have candidates, double-click the desired candidate to replace it.

2.       To append the new text to the existing text, select Append Cell Text by Word.

3.       To replace text with the new text, select Replace Cell Text by Word.

8.1.3     Insert Non-Candidate Words in Table Cells

Even if the word does not belong to any candidates for the cell, you can insert single words, append or replace existing text in table cells. To insert words in table cells, complete the following steps:

1.       To append the new text to the existing text, double-click the word or right-click in the image viewer.

2.       Select Align & Copy to Current Field.

3.       To replace text, select the word, and then select Copy to Current Field in the shortcut menu.

8.1.4     Correcting Table Structure

You may need to correct the table structure. Table rows, cells, and columns have shortcut menus with options for modifying the table structure. To invoke them, right-click on the row, cell, or column label.

The available commands are summarized in the table below:

Shortcut Menu | Command | Description
Column | Unmap | Clears all data for the selected verification column and turns the state of the corresponding column of the recognized table back to unmapped. To view an unmapped column, double-click on the table header in the verification form. All unmapped columns are highlighted in red.
Column | Map | Adds the column selected from the shortcut menu, or you can right-click on an unmapped column to map it to a column in the verification form.
Column | Swap | Exchanges the position of the current column and the one selected from the dropdown menu.
Row | Insert | Inserts an empty row above the current one.
Row | Delete | Deletes the current row.
Row | Duplicate | Duplicates the current row.
Row | Append | Appends an empty row at the bottom of the table.
Row | Merge | Merges cells in a row.
Cell | Insert Cell | Inserts an empty cell above the selected cell while shifting the cells below down.
Cell | Delete Cell | Deletes the selected cell. The cells below are shifted up.

8.1.5     About the Rubber-Banding Feature

The rubber-banding feature allows you to select a block of data on a document and place this block of values at a particular point within the table.

Note that the Table Correction mode must be switched off for the menus described in the following sections, though a mixed usage of manual and engine driven table correction is possible.

8.1.5.1      Auto-Scroll with the Rubber-Banding Feature

To auto-scroll when using the rubber-banding feature, use one of the following options:

  If the target document area is displayed only partially, for example because of the zoom level, move the mouse outside the document area while rubber-banding to scroll the document and select all of the desired data.

  If you want to re-size a rubber-banded area, drag the corner of the rubber-band rectangle. The window will auto-scroll if you want to select values outside of the visible area.

8.1.5.2      Add Column Data to a Table

There are two ways to add column data to a table:

  Insert column data.

If whole rows are missing after extraction, use this option for better accuracy. Note that the application observes the relationship between the columns and maps the values appropriately. Already extracted cell values are shifted up or down by the insertion, which creates new rows.

  Replace column data.

This option is convenient when at least one column has been extracted completely and other table columns contain random values. With this option, the column data can be added in blocks, overwriting the previously extracted entries.

8.1.5.3      Insert Column Data

To insert column data, perform one of the following actions:

  To insert column data above a correctly extracted and filled cell, click into the filled cell and select Insert Column Table Data from the menu. This creates additional rows and shifts already existing rows up. At the same time, the values will be automatically assigned to already extracted values of other columns if available.

  To insert column data below a correctly extracted and filled cell, click into the next empty cell below and select one of the two options depending on whether or not you want to keep already extracted cell entries.

8.1.5.4      Replace Column Data

To replace column data, click into the desired starting point cell and select the Replace Column Table Data option.

Provided that the rubber-banded area spans more lines than already contained in the table, the additional subsequent lines will be added as additional rows continuing the table.
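The effect of Replace Column Table Data can be sketched on a table stored as a list of rows. This is a simplified model, not the product's implementation. The function name, the list-of-lists layout, and the use of an empty string for empty cells are assumptions.

```python
def replace_column_data(table, start_row, col, values, empty=""):
    """Overwrite one column from start_row downward with the given values.

    If there are more values than remaining rows, additional rows are
    appended to continue the table. The input table is not modified.
    """
    rows = [list(row) for row in table]
    width = len(rows[0]) if rows else col + 1
    for offset, value in enumerate(values):
        r = start_row + offset
        while r >= len(rows):
            rows.append([empty] * width)  # continue the table with new rows
        rows[r][col] = value
    return rows
```

For example, replacing the second column of a two-row table with three values overwrites both existing entries and adds a third row that holds only the new value.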

8.1.5.5      Use the Rubber-Banding Feature

To use the rubber-banding feature, complete the following steps:

1.       Place the cursor into the destination cell within the table.

2.       Draw a rubber-band rectangle around the column data on the image within the Document Viewer.

3.       Right-click the selection on the image and click either Insert Column Table Data or Replace Column Table Data in the popup menu, depending on your task.

8.1.5.6      Rubber-Banding Use Case One

A use case for the rubber-banding feature is if columns of one data type are split up and placed side-by-side, or stacked, on a page. To correct such a table, complete the following steps:

1.       Place the cursor into the first line of the column.

2.       Draw a rubber-band rectangle around the first block of column data on the document.

3.       Right-click the selection and click Insert Column Table Data.

4.       To append further column data, right-click the row's node of the last table row and select Append Row.

5.       Place the cursor into the next empty cell of the desired table column.

6.       Draw a rubber-band rectangle around the next data block of the same column type and select Insert Column Table Data.

7.       Proceed the same way with other columns to fill the table.

8.1.5.7      Rubber-Banding Use Case Two

A use case for the rubber-banding feature is if column items have been extracted only partially. To correct such a table, complete the following steps:

1.       Place the cursor into the already extracted cell to mark the starting point for the insertion.

2.       Draw a rubber-band rectangle around the column data on the document.

3.       Right-click the selection and select Insert Column Table Data from the popup menu.

4.       Now continue mapping the other column data.

8.1.5.8      Rubber-Banding Use Case Three

A use case for the rubber-banding feature is if column items are missing from the neighboring column. The following example shows how to insert missing values:

1.       Place the cursor into the Reference cell.

2.       Perform the steps as described in Use Case Two.

8.1.5.9      Rubber-Banding Use Case Four

A use case for the rubber-banding feature is if you have documents where the data columns appear misaligned. To correct this situation, map the column data in blocks, as described for previous use cases.

8.1.6     About Correcting Single Cells

You may need to correct the table structure if, for instance, an unnecessary cell has been mapped to the table or if a missing cell has to be added.

You may have documents where one of the line items is missing. During extraction, the values from below might be shifted up to fill the empty space.

In such cases, you can add or remove single cells.

8.1.6.1      Delete an Unnecessary Cell

To delete a cell, complete the following steps:

1.       Click the cell within the table to place the cursor in it.

2.       Right-click and select Delete Cell (Shift Cells Up) from the popup menu.

8.1.6.2      Insert a Cell

To insert a cell, complete the following steps:

1.       Click the cell within the table that is subsequent to the cell candidate and place the cursor in it.

2.       Right-click and select Insert Cell (Shift Cells Down) from the popup menu.

This creates an empty cell within the table above the selected cell. Now, you can copy the desired value from the document into the newly created cell.
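The two single-cell commands can be sketched as column shifts on a table stored as a list of rows. The following is an illustration only. The function names are hypothetical, the table is assumed to be non-empty, and appending a new row when a value is pushed past the last row is an assumption that this guide does not state.

```python
def insert_cell(table, row, col, empty=""):
    """Insert an empty cell above the selected cell (shift cells down)."""
    rows = [list(r) for r in table]
    column = [r[col] for r in rows]
    column.insert(row, empty)
    if column[-1] == empty:
        column.pop()  # nothing was pushed out of the table
    else:
        # Assumption: a value pushed past the last row gets a new row.
        rows.append([empty] * len(rows[0]))
    for r, value in zip(rows, column):
        r[col] = value
    return rows


def delete_cell(table, row, col, empty=""):
    """Delete the selected cell (shift cells up)."""
    rows = [list(r) for r in table]
    column = [r[col] for r in rows]
    del column[row]
    column.append(empty)  # the last row of this column becomes empty
    for r, value in zip(rows, column):
        r[col] = value
    return rows
```

For example, if the value for the second row is missing and the value `C` was shifted up into its place, inserting a cell at that position moves `C` back down to the third row and leaves an empty cell to fill from the document.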

8.2     About Table Extraction and Correction

The learning process for the Brainware Table Extraction engine consists of two phases:

  Learning lines

  Learning mappings of columns

These are discussed in detail in the following sections.

Note that this functionality is available for the Supervised Learning Verifiers. With the Generic Table Extraction, no extra learning is needed.

8.2.1     About Learning Lines

The Brainware Table Extraction engine considers the following main types of lines:

Primary Line

A line that defines table structure. The engine applies advanced and precise similarity analysis for all primary lines. It is important that all primary lines are well-structured and that they look similar in many of the rows to extract. The engine easily supports an unlimited number of types of primary lines for one table definition. The primary line must contain at least four words. Otherwise, the engine will not learn it. In addition, the primary line must be the first line in the table row.

Secondary Line

A line between primary lines. The engine applies smooth similarity analysis for these types of lines, which is possible because Brainware Table Extraction only searches the area between two neighboring primary lines. This allows the engine to extract data that varies widely, which often happens with multi-line descriptions. There is also no limit to the number of words in secondary lines, and no limit to the number of secondary lines. However, a document's page must have at least one primary line; otherwise, secondary lines on this page are not extracted.

Wrong Line

A primary line that is learned as a negative line sample. In other words, all lines classified by the engine as members of one particular wrong line class are not extracted. In principle, it is possible to learn an unlimited number of wrong lines, though the current restriction is that this will only take effect during in-document learning. Cross-document learning (that is, learning the whole document after all the fields are completely valid) may not automatically train the wrong lines.

After it learns any type of line, the Brainware Table Extraction engine automatically creates and manages a new line class (cluster). Afterward, all lines in the document considered by the engine to be members of the line class (similar to the learned line sample) will be extracted, or not extracted in the case of wrong lines.

It is possible to learn an unlimited number of different line classes. However, the overall quality may suffer if too many lines are learned.

Learning lines can be applied in lines learning (or lines highlighting) mode. Mapping of the column data in the lines can be done in column mapping learning (or columns highlighting) mode. The user can switch between learning (highlighting) modes with the Switch Table Highlighting menu option in the Options menu, or with the context menu options Show Lines and Show Columns.
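The stated preconditions for primary and secondary lines can be written down as a small check. This is a sketch of the rules as described in this section, not of the engine's logic: the function names are hypothetical, and counting words by splitting on whitespace is an assumption.

```python
MIN_PRIMARY_WORDS = 4  # from the text: a primary line needs at least four words


def can_learn_as_primary(line_text, index_in_row):
    """Check the two stated preconditions for a primary line.

    Assumption: words are counted by splitting on whitespace.
    """
    has_enough_words = len(line_text.split()) >= MIN_PRIMARY_WORDS
    is_first_in_row = index_in_row == 0
    return has_enough_words and is_first_in_row


def secondary_lines_extractable(page_line_types):
    """Secondary lines on a page are extracted only if the page has a primary line."""
    return "primary" in page_line_types
```

For example, a two-word description line fails the word-count check and therefore cannot serve as a primary line sample, even if it is the first line of the row.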

8.2.2     About Learning Column Mappings

When learning the mapping for columns, the user trains the engine on how the data from the extracted lines must be mapped to the user's table data.

For primary lines, this mapping can be defined differently for different line classes. For example, if a user learned two different line samples that went to two different line classes internally in one document, the user can then map Unit Price in the document to the Unit Price data column, and Total Price to the Total Price data column, for the first line sample. For all lines of the second line type, the user can map Unit Price to Total Price, and Total Price to Unit Price. For the next document, the Brainware Table Extraction engine will always use the first set of mapping rules for the lines classified as the first line type, and the second set of mapping rules for the lines classified as the second line type.

If you have several Brainware Table Extraction tables in one class, the learnset is shared between these tables. In other words, if you used interactive learning for one Brainware Table Extraction table, cross-document learning (which happens if the system added the document to the learnset after document validation) is applied for all Brainware Table Extraction tables in the document.
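The idea of one set of mapping rules per line class can be sketched as a lookup table. The structure below is hypothetical (the names `MAPPING_RULES`, `map_line`, and the class labels are invented for illustration) and only mirrors the Unit Price and Total Price example given above.

```python
# Hypothetical mapping rules: line class -> {item in the document line: table data column}
MAPPING_RULES = {
    "line_class_1": {"unit_price": "Unit Price", "total_price": "Total Price"},
    "line_class_2": {"unit_price": "Total Price", "total_price": "Unit Price"},
}


def map_line(line_class, cell_items):
    """Build one table row from the cell items of an extracted line."""
    rules = MAPPING_RULES[line_class]
    # Items without a mapping rule are simply left out of the row.
    return {rules[source]: value for source, value in cell_items.items() if source in rules}


items = {"unit_price": "4.50", "total_price": "9.00"}
row_1 = map_line("line_class_1", items)  # mapped straight through
row_2 = map_line("line_class_2", items)  # the two columns are swapped
```

Because the rules are keyed by line class, a newly extracted line only needs to be classified to pick up the mapping that was learned for its class.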

8.2.3     About Correcting Fields in Tables Created with Brainware Table Extraction

Any time you train a table interactively, complete the required training first and then manually verify the table.

Brainware Table Extraction can train line types and column mapping for each type of line.

When working with interactive table extraction, learn the lines before you map the columns.

Because of the way interactive table verification works, you cannot manually delete data from a cell. Rather, if you want to discard cell data, un-map the column and re-extract the table to remap the column. Although it will seem as if you deleted the data, the data is still there until you un-map the column.

8.2.4     Use the Standard Method for Table Extraction

This section describes the simplest way to use interactive Brainware Table Extraction learning. If this method does not work, proceed to the advanced method described in the following sections. To use the standard method, complete the following procedures:

1.       Show the first row sample.

2.       Learn mapping in the row you learned.

3.       Learn missing lines.

4.       Learn and adjust the mapping of missing or wrong columns.

5.       Manually correct the table data and validate the table.

6.       Learn the document.

8.2.4.1      Step 1: Show the First Row Sample

1.       Select your Brainware Table Extraction table by clicking any table field inside the table grid.

2.       Click the Correct Tables button.

3.       In the lines highlighting mode, use the Learn as Row function to show the row sample. This function automatically learns the first line as a primary line and the rest of the lines as secondary lines. This function is also available by double-clicking on the selected row area. Select the whole first row and learn it.

Note: The visual indicators for valid, invalid and questionable table lines are as follows: valid lines are highlighted green, invalid lines are highlighted gray, and questionable lines are highlighted blue.

4.       Optional. To learn a new line as a primary line, complete the following actions:

  Right-click on any line marked in gray in the Image Viewer.

  On the popup menu, select Learn Line.

  The learned lines change from gray to green, or to blue if the line is extracted with low confidence.

5.       Optional. To learn a block of lines as primary lines, complete the following actions:

  In the Document Viewer, draw a rectangular selection over the primary lines.

  Right-click on the selection.

  On the shortcut menu, select Learn as Primary Line(s).

  All correctly selected primary lines will be learned and highlighted in green (or blue if the line is selected with low confidence), and all other lines will be similarly extracted and displayed.

Note: If some lines were not extracted, try relearning the lines singly or in a block.

6.       Optional. To learn a lines block as a table row, complete the following actions:

  In the Document Viewer, draw a rectangular selection over the required multi-line (or single-line) table row.

  Double-click or right-click on the selection.

  From the popup menu, select Learn as Row.

If some lines were not extracted, repeat the procedure described above.

Do not try to learn the rest of the missing secondary or primary lines now, because mapping is defined on the basis of line type. If you trained all the different line samples now, you would need to learn the column mapping separately for every line class. To reduce the time needed to train the table, first learn the column mapping for the row you just learned. If you then learn another line sample, the engine applies the existing mapping rules to the newly learned row automatically.

Green highlighting indicates that a line is extracted with high confidence; blue highlighting indicates low confidence. To see the confidence value, move the mouse cursor over a highlighted line; the value is shown as a tooltip. If the confidence for a blue line is less than 0.3, the line is not extracted.

Blue highlighting also has the following important meaning: this line can be trained by the engine as a new line class.

All correctly selected primary lines will be learned, and all other lines are similarly extracted and displayed.
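The highlighting rules described in this step can be summarized in a small decision function. This is a sketch under stated assumptions, not the engine's logic: only the 0.3 cutoff comes from this guide, while the boundary between green and blue (`ASSUMED_HIGH_CONFIDENCE`) and the `learned` flag are hypothetical.

```python
EXTRACTION_CUTOFF = 0.3        # from the text: blue lines below this are not extracted
ASSUMED_HIGH_CONFIDENCE = 0.8  # hypothetical: the guide does not state the green/blue boundary


def describe_line(confidence, learned=True):
    """Return (highlight color, whether the line is extracted)."""
    if not learned:
        return ("gray", False)
    if confidence >= ASSUMED_HIGH_CONFIDENCE:
        return ("green", True)
    # Low-confidence lines are blue and are dropped below the cutoff.
    return ("blue", confidence >= EXTRACTION_CUTOFF)
```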

8.2.4.2      Step 2: Learn Mapping in the Row You Learned

1.       Switch to the columns highlighting mode now (using [Ctrl] + [Q]) and mark the location of your first cell item in the row you learned.

2.       The system displays a special mapping control asking for the data column to extract the data to.

3.       Select the required data column by double-clicking on it.

4.       Repeat this step for the rest of the cell items in the first row.

8.2.4.3      Step 3: Learn Missing Lines

1.       Switch back to the lines highlighting mode.

2.       Mark the next missing row and learn it as described before.

3.       Repeat steps 1 and 2 for all rows on all pages where something is missing. Go to the next step only after you are sure nothing is missing.

8.2.4.4      Step 4: Learn and Adjust the Mapping of Missing or Wrong Columns

1.       Return to columns mapping learning mode and look for wrong or missing mapping. Correct any missing mapping.

2.       If you can't map the missing columns, switch back to the lines highlighting mode and try to learn the row where the mapping is missing.

3.       Switch to columns highlighting. If the mapping is still missing, mark the missing part and map it.

Note: The Brainware Table Extraction engine may determine the mapping automatically.

4.       Repeat steps 1 - 3 until the data is completely extracted or cannot be learned correctly.

Note: There is always a chance that you will not get 100 percent extraction results.

8.2.4.5      Step 5: Manually Correct the Table Data and Validate the Table

Switch to cells highlighting mode and manually correct missing data, OCR errors, and so on.

Optional. Click the Add Current Document to Local Learnset button to add the document to the Learnset and then learn it, which is known as cross-document learning. Do this if the system did not suggest learning the document automatically.

Note: The only requirement for cross-document learning is correctness and completeness of the table data to train. This means that location, content, and format of every cell item is accurate.

8.2.4.6      Step 6: Learn the Document

After table learning and validation are complete, and the rest of the document's fields are validated, you may want to add this document to the learnset and then learn it. This is cross-document learning, in contrast with in-document interactive Brainware Table Extraction learning.

If the system did not suggest learning the document automatically, but you still would like to learn your table, activate learning by clicking the Add Current Document to Local Learnset toolbar button.

Note: The only requirement for cross-document learning is correctness and completeness of the table data to train. This means that location and content of every cell item should be correct. Also, ideally, the content of cell items should not be formatted.

8.2.5     Advanced Learning with Brainware Table Extraction

This section discusses the special cases in which it is necessary to use secondary lines explicitly. There are two such cases:

  The table row begins on one page and ends on the next.

If a table row begins on one page and ends on the next page, you must use the Learn as Secondary Lines function (in lines learning mode) to train missing secondary lines (on the next page). In this case, these secondary lines are placed right before the first primary line on the page. Mark all the secondary lines as before: right-click and select Learn as Secondary Lines.

Never use the Learn as Row function in this case, as this tells the engine that the first secondary line is actually a new sample of primary line. As a result, the engine may split extracted table data into new rows.

  Learning of unmapped secondary lines leads to unwanted extraction.

Your project may require that data from secondary lines not be extracted. Usually, this will not be a problem, but sometimes the engine extracts the data from these lines anyway. In this case, not learning these secondary lines will prevent unwanted extractions. Use the Learn as Secondary Lines function instead of Learn as Row if you would like to learn just selected lines and not all lines that belong to the row. You can also Unlearn Line to correct or adjust the extraction.

8.2.6     Advanced Learning: Additional Functions

8.2.6.1      About the Unmap Column Method

The Unmap Column method can undo mapping for the specified cell item.

This will undo mapping for all cell items that were extracted from the lines that belong to the same line type as the cell item used to invoke the Unmap Column method.

8.2.6.2      Undo Column Mapping

To undo incorrect column mapping, complete the following steps:

1.       Right-click on any unassigned column (highlighted in blue) or draw a rectangular selection over the cell items to be mapped to a table column.

2.       On the shortcut menu, select Undo Mapping.

The previously assigned column (highlighted in red) is now unassigned. The values are no longer extracted or in the table grid.

8.2.6.3      Learn a Block of Secondary Lines

To learn a block of secondary lines, complete the following steps:

1.       In the Document Viewer, use the mouse to draw a rectangular selection over the required secondary lines of a desired multi-line row.

2.       Right-click on the selection.

3.       On the popup menu, select Learn as Secondary Line(s).

All correctly selected secondary lines will be learned and highlighted in green (or blue if the line is selected with low confidence), and all other lines are similarly extracted and displayed. If some lines were not extracted (these lines will not be color-coded), repeat the procedure described immediately above.

8.2.6.4      Unlearn a Line

The Unlearn Line function can be used to discard previously applied learning for a particular line. To do this, Brainware Table Extraction uses a line sample, searches for the line type, and removes the line type from the learnset. To unlearn a line, complete the following steps:

1.       Switch to Lines Learning mode and right-click on the line you want to unlearn.

2.       On the shortcut popup, click Unlearn Line. Unlearned lines change from green to gray.

8.2.6.5      Learn a Line as a Wrong Line

Learning a wrong line means training the table so that a particular line will not be extracted. This also applies to other lines of the same type in the table. To learn a line as a wrong line, complete the following steps:

1.       Right-click on any learned line or draw a rectangular selection over the required lines.

2.       On the popup menu, select Learn as Wrong Line. The selected lines and any lines similar to them are now highlighted in gray. Information from these lines will not be extracted.

9      Working with Learn Set Manager

9.1     About Supervised Learning

The basic purpose of Learn Set Manager is to use Supervised Learning to improve the quality and usefulness of your enterprise's learnsets.

With Supervised Learning, Supervised Learning Verifiers and LearnSet Managers can customize your project's learnsets by adding or removing documents, reclassifying them, or creating altogether new classes or learnsets and migrating documents there. They can also promote local learnsets to a global learnset so that it can be shared across the enterprise.

In general, Learn Set Manager consists of:

  Creating new classes based upon documents themselves and supplier information.

  Learning documents and adding them to the local learnset.

  Using the local learnset to improve the extraction of low-quality documents.

  Maintaining local learnsets.

  Updating and enhancing the global learnset with information from the local learnsets.

Although Supervised Learning was created for use with vendors' invoices, it can also be used with other types of knowledge. For example, a library might create classes based on type of material, subject matter, or author. Most of the illustrations and examples in this chapter use invoices.

Learn Set Manager can only be launched from Verifier. You have access to Learn Set Manager mode only if you have been assigned to a group that has permission to work with the mode. There is no limit on the number of users who can simultaneously access Learn Set Manager.

9.2     Start Learn Set Manager

To start Learn Set Manager, complete the following step:

  Click the Start Learn Set Manager button in the Verifier toolbar.

9.3     Getting Familiar with the Learn Set Manager User Interface

Learn Set Manager has two basic modes, or views. These are the Accumulated Documents View, where you work with local learnsets, and the Global Learnset View, where you work with common learnsets and global learnsets.

When you are working with local learnsets, you can further refine the appearance of the Accumulated Documents browser when you verify documents or manually reclassify them.

9.3.1     Accumulated Documents Browsing

The Accumulated Documents Browsing view has the following sections:

Batch Viewer

Enables you to see each class in the batch you are working on. The Batch Viewer shows each class as part of the batch, the user who created the batch, the date it was created, the number of documents in the batch and in each class, the number of documents successfully classified and the number of documents successfully extracted. You will need to enlarge the window to see all of these categories.

Document Viewer

As with the Document Viewer in Verifier, this window enables you to see (and therefore verify) each document in the batch you are working on. You can verify documents in Document Viewer.

Learning Statistic Window

Shows the documents that have been processed by WebCenter Forms Recognition. Documents that are awaiting processing have a question mark beside them. Successfully processed documents have a check mark, while documents that failed processing have an X.

9.3.2     Global Learnset Browsing

The Global Learnset Browsing view has the following sections:

Batch Viewer

Enables you to see each class in the batch you are working on. Shows the classes in the global learnset, and the number of documents classified or extracted in each.

Document Viewer

As with the Document Viewer in Verifier, this window enables you to see (and therefore verify) each document in the batch you are working on. You can verify documents in Document Viewer.

Learning Statistic Window

Shows the documents that have been processed by WebCenter Forms Recognition. Documents that are awaiting processing have a question mark beside them. Successfully processed documents have a check mark, while documents that failed processing have an X.

9.3.3     Learn Set Manager Keyboard Shortcuts

The following menu commands and keyboard shortcuts are available in Learn Set Manager:

Keyboard Shortcut    Command

[Ctrl] + [E]         Closes Learn Set Manager.

[F5]                 Verifies documents in Supervised Learning.

[F7]                 Manually reclassifies document.

[F1]                 Opens Learn Set Manager help.

9.3.4     Learn Set Manager Toolbar Buttons

The Learn Set Manager toolbar provides quick access to some frequently used commands:

Button

Description

Show settings.

Switch to Accumulated Documents Browsing (the local learnset).

Switch to global learnset processing.

Verify documents.

Correct tables. Allows you to correct data in the tables. You have to click a table field for this to be active.

Accept documents.

Reject documents.

Learn documents (add them to the global learnset).

9.3.5     Learn Set Manager Viewer Toolbar Buttons

On the Viewer toolbar, you can use the following commands to adjust the size of a document relative to the width of the Document Viewer window:

Button

Description

Fits the document to window height.

Fits the document to window width.

Best fit.

Zooms in.

Zooms out.

9.4     Using Learn Set Manager

9.4.1     About the Learn Set Manager Process

Use Learn Set Manager to work with local learnsets. First, verify the documents, decide whether they belong in the learnset, add them to the common learnset, and train the learnset.

In the common learnset, you examine the documents for inclusion in the global learnset, accept or reject them, and add them to the global learnset. Finally, you train the global learnset.

9.4.2     Configure LearnSet Manager

To configure LearnSet Manager, complete the following steps:

1.       In Verifier, examine the properties for LearnSet Manager. Most were established in Designer. However, you need to ensure that Learn Set Manager is enabled and that the Activate Supervised Learning Workflow checkbox is checked. Also, ensure that the entries for Local Project Name, Local Learnset Directory, and Knowledge Base Directory are correct. If you are using the Forms Recognition database, you select a database job instead of specifying a Knowledge Base Directory path.

2.       Launch the Learn Set Manager module from Verifier by clicking the Learn Set Manager button on the toolbar.

3.       On the Learn Set Manager toolbar, click the Settings button.

4.       Review the settings as follows:

Show Learned State Using Engine-Level Information

Make sure this option is checked. This setting indicates whether the particular field or document was used by the system for learning. If required, a user can also disable learning for the desired field/document.

Automatically Reject Document if Number of Pages Exceeds

Documents with more pages than specified in the appropriate field will be prevented from being added to the learnset.

Inherit Verifier Settings

If this option is selected, the settings made on the Verifier settings tab will be applied, and the options below will be grayed out.

If you want Learn Set Manager to use a different global project, or a different job containing data, clear this option and populate the options below as needed.

Use Database as Document and Statistics Source

WebCenter Forms Recognition core information can be stored in the Forms Recognition database.

Select Job

You can select the desired job from the Select Job dropdown if you have selected the Forms Recognition database as your source.

Accumulative (common) batch root path

This option to set the path to the batch root is not available when using the database.

Automatic Backup

Select the files you want to be backed up automatically.

  Project file.

  Project learnset.

  LSM train set.

9.4.3     Work with Common Learnsets

To work with common learnsets, complete the following steps:

1.       When you launch Learn Set Manager module from Verifier, the Accumulated Documents Browsing Mode will be shown by default. This is the mode you use to work with common learnsets, and it is activated when you click the Switch to Accumulated Documents Browsing toolbar button or when you select Accumulated Documents Browsing from the View menu.

2.       In the Accumulated Documents Browsing Mode, select a batch to work on.

3.       Double-click on a class to select it.

4.       Select a document to work on and verify the document just as you would in the traditional Verifier. Click the Advanced Verifier Mode toolbar button to correct or verify the contents of each field and table.

5.       After you have verified the document, click the Accept button. This marks the document for learning as the first step for promotion into the global learnset. You could also click the Reject button to eliminate the document from being considered for the global learnset.

6.       Select another document from the batch by accessing the Learn Statistics Panel at the bottom of the screen, where you double-click a document to open it in the Document Viewer and work on it.

7.       When you have verified all the documents you need to verify, click the Learn Documents button. This promotes all the accepted documents to the global learnset. Notice that the Learn Statistics window for the local learnset is now empty.

9.4.3.1      Correct Tables

To correct tables, complete the following step:

  Select a table in a document and then click the Correct Tables button.

This enables Supervised Learning Managers and Verifiers to interactively train all the tables on a document form, not just the table you selected so you could activate the Correct Tables button. From there, table correction in Learn Set Manager proceeds just as it does in Verifier.

9.4.3.2      Reclassify a Document

To reclassify a document, complete the following steps:

1.       To assign a document to a different class, select the Document menu from the main menu and select Reclassify. Alternatively, press the [F7] key.

2.       This opens a dialog box in the Verifier Document Viewer where you can assign the document to a new class.

3.       Select the new class from the list and press the [Enter] key.

Note: Manual reclassification will only succeed with the classes that Verifier is currently using, not the classes that have already been learned.

9.4.3.3      Accept and Reject Documents

There are two ways to accept or reject a document or batch from a learnset. The first is the traditional way, by using Verifier to manually screen and verify the document or batch. The other method is by comparing documents in the common learnset to the corresponding batch in the global learnset.

Note: If you enable the Automatically Reject Document if Number of Pages Exceeds option in the LearnSet Manager settings, documents with more pages than specified in the appropriate field will be prevented from being added to the learnset. The user is notified when a document exceeds the number of allowed pages, and the process continues when the user chooses Yes in the pop-up window.
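The page-limit rule in the note above amounts to a simple filter. The following is a minimal sketch, assuming documents are available as (name, page count) pairs; the function name and data structure are hypothetical and not part of the product.

```python
def split_by_page_limit(documents, max_pages):
    """Separate documents that may enter the learnset from those rejected for length.

    documents: list of (name, page_count) tuples (hypothetical structure).
    A document is rejected only if it has MORE pages than max_pages.
    """
    allowed = [name for name, pages in documents if pages <= max_pages]
    rejected = [name for name, pages in documents if pages > max_pages]
    return allowed, rejected
```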

9.4.4     About Sorting by Vendor and Other Sorting Extensions in Learn Set Manager

Learn Set Manager can sort by vendor name across multiple batches produced by different local supervised learning Verifiers. This simplifies Supervised Learning Workflow decisions as to which documents to train for a specific vendor class. With the help of this feature, the user can now review all documents (for the same vendor class) created by different Verifier workstations at once.

From the View menu, select Sort Batches by Vendor. The system rebuilds the batches of documents created through multiple sessions by multiple Verifier workstations, and allows the Learn Set Manager user to sort by vendor name. In this case, each vendor folder accumulates all available documents for this vendor, so that the user can select the best documents with which to train the global project.

The Created On and Created By data fields are displayed separately for each particular document in the Document List at the bottom of the Learn Set Manager window.

Note that in the Global Learnset browsing mode, the options to sort batches by vendor and sort batches by date are disabled. Both of these sorting options are enabled only in Accumulated Document browsing mode. However, in Global Learnset browsing mode, the vendor classes are sorted in alphabetical order. Furthermore, under the Display Class Name column, the class name is shown only for the document classes that already exist in the global project.
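Conceptually, sorting batches by vendor regroups the documents of all batches into one folder per vendor. The sketch below illustrates that regrouping only; the input structure (batch name mapped to vendor and document pairs) and the function name are assumptions, not the product's data model.

```python
from collections import defaultdict


def group_by_vendor(batches):
    """Collect documents from all batches into one folder per vendor, sorted by vendor name.

    batches: dict of batch name -> list of (vendor, document) tuples (hypothetical structure).
    """
    folders = defaultdict(list)
    for documents in batches.values():
        for vendor, document in documents:
            folders[vendor].append(document)
    # Sort the vendor folders alphabetically by vendor name.
    return dict(sorted(folders.items()))
```

With this grouping, documents for the same vendor that were created on different Verifier workstations end up side by side, which is what makes it easier to pick the best training samples.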

9.4.5     Work with Global Learnsets

The global learnset is where you further refine the quality of your data before migrating it into an effective and useful global knowledgebase. To work with global learnsets, complete the following steps:

1.       Click the Global Learnset Viewing button.

2.       Begin your work at the document level by examining the quality of the data in the document. As before, you select the document from the Learning Statistics window at the bottom of the screen.

3.       Right-click the document if you are satisfied with it, and then select Enable.

4.       Select Use to Train Base Classes.

5.       Click the Learn button.

6.       Confirm that you do want to learn the document by clicking Yes.

7.       After you have verified each field in the document, click the Accept button on the toolbar to accept the document for processing, or click the Reject button to reject it.

8.       Now retrain the learnset by clicking the Learn Documents button on the toolbar.

9.4.6     Train Base Classes

The final milestone in creating or enhancing your global learnset is to train base classes. To train a base class, complete the following steps:

1.       On the Options menu, select Train Base Classes.

2.       Select the base class to train.

3.       Under Train Selected Base Classes, select a value. To avoid errors while maintaining the quality of your sample, select the lowest value possible.

4.       Click OK.

9.4.7     Update Local Projects

The ability to update local projects is important for keeping your learnsets synchronized. During work with Verifier, the global project's learnsets are constantly updated. An administrator may then wish to update the out-of-date local projects with the new global project. The administrator adds all the local projects to the main list, points to the global project template for overwriting, and presses Update to update them.

To update local projects, complete the following steps:

1.       On the Options menu, click Update Local Projects.

2.       Configure the selections described below and click Update. This procedure can take a while, especially if you are updating projects on a network. Note that locked projects will not be updated. The Update Local Projects dialog box specifies a list of local or network project paths to be managed. Here, you can:

  Add project paths to the Update list by clicking Add and browsing to the project.

  Remove paths from the Update list by clicking Remove and browsing to the project.

  Change existing project paths by clicking Change and browsing to the project.

  View a history of all Verifier workstations that have connected to the common learnset. The History shows the workstation name, the time and date of its last connection, and the local project path.

This list is updated every time a Verifier station creates a new batch of locally learned documents in the common learnset.

3.       Refresh the list of projects to see the most recent update information about them.

4.       Empty the template cache. The Empty Template Cache Project field is used to update the local project.

5.       Update the list of projects or save the new criteria without actually updating the projects.

For each configured path, the dialog box shows whether a project is up to date, whether it is locked, and whether the project is available. An up-to-date project has a green check mark beside it; a project that has not been updated has a red X. Path names of unavailable projects are dimmed. Note that the settings you establish above will be available for any workstation on which Learn Set Manager is opened.

9.5     About Using Learn Set Manager on Several Workstations

Learn Set Manager can be used simultaneously on more than one workstation, thanks to the application's ability to lock projects and files.

Learn Set Manager supports three levels of protection that facilitate this ability:

  Batch-level locking.

  Allowing all Learn Set Manager workstations to view changes made by all Supervised Learning Managers.

  Locking project files and learnsets while they are being trained.

9.5.1     About Batch-Level Locking

Batches are locked when they are in process. This prevents several users from updating the same batch at the same time. No one else can access the batch until processing is completed and the batch is closed.

9.5.2     About Tracking Changes Made by Supervised Learning Managers

The changes applied by one Learn Set Manager user should be visible to all other Learn Set Manager users. To accomplish this, Learn Set Managers must use two predefined batch document states:

  981: Accepted.

  982: Rejected.

When learning is executed, the Learn Set Manager application checks whether documents with either of these assigned states have been added to the local learnset. Documents with a state of 981 are added to the global learnset. Documents with a state of 982 are not added to the global learnset.
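The role of the two states during learning can be sketched as a filter over the local learnset. Only the state values 981 and 982 come from this guide; the function name and the dictionary structure are hypothetical.

```python
STATE_ACCEPTED = 981  # from the text
STATE_REJECTED = 982  # from the text


def documents_for_global_learnset(local_documents):
    """Return the names of documents that learning would add to the global learnset.

    local_documents: dict of document name -> batch document state (hypothetical structure).
    Only accepted documents qualify; rejected documents and other states are skipped.
    """
    return [name for name, state in local_documents.items() if state == STATE_ACCEPTED]
```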

9.5.3     About Project File and Learnset Locking During Training

Only one workstation can do learning at one time. This means that the learning process is locked, and therefore not available for other users, if one user has initiated learning.
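The "one learner at a time" behavior resembles a non-blocking lock. The sketch below shows the pattern inside a single Python process; it is an analogy only. How Learn Set Manager actually locks projects across workstations is not described here, and `learning_lock` and `try_start_learning` are hypothetical names.

```python
import threading

learning_lock = threading.Lock()  # stands in for the project-level lock (assumption)


def try_start_learning(run_learning):
    """Run learning only if no one else holds the lock; return whether it ran."""
    if not learning_lock.acquire(blocking=False):
        return False  # someone else is already learning
    try:
        run_learning()
        return True
    finally:
        # Always release, even if learning raises an error.
        learning_lock.release()
```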

10   Frequently Asked Questions

The frequently asked questions and their answers below may help to resolve some situations that you might experience during the extraction and validation process.

Q.     In one of my batches, there is a document that must be classified manually, but it does not belong to one of the available classes. I cannot release the batch as it is. What can I do to finish my job?

A.     Normally, your organization will have specialized workstations where people are in charge of handling special cases that only occur as exceptions.

Q.     In one of my batches, there is a document I have already validated. However, I've overlooked a mistake in this document. I don't want to release the batch without correcting it.

A.     You can use the Document Mode to get to the document. Select the document and switch to Verify Mode. Make corrections and press the [Enter] key.

Q.     Sometimes the indexing window looks unusual. It has no field area, only the current input area. How do I get to the next field?

A.     This is not a problem. You can use all keyboard shortcuts for field navigation from within the current input area.

Q.     When I switch from one field to the next, the document view moves as well. I find this annoying. Is there a way to stop that?

A.     The application always searches for the document area associated with the current field's content. This area is then displayed. To turn this off, click the Keep Focus button in the toolbar. Alternatively, you could just use a different magnification ratio.

Q.     I want to start Learn Set Manager without going through Verifier first. Can I do that?

A.     No. Learn Set Manager is an add-in that can only be started in Verifier.

Q.     I tried to start Learn Set Manager in Verifier, and I still can't do it. Why?

A.     There are three possible reasons for this:

  You may not have permission to use this add-in. Check with your project administrator to see if you are assigned to a group that can work with Learn Set Manager mode.

  Learn Set Manager might not be enabled for the project. Again, contact your project administrator.

  Learn Set Manager mode might not be properly licensed. Learn Set Manager gets its license through a Runtime Server process. If you have been able to get into Learn Set Manager mode before, but you cannot now, it may be that Runtime Server has stopped.