Oracle® Application Server Portal Configuration Guide
10g Release 2 (10.1.2)
B14037-03

H Using TEXTTEST to Check Oracle Text Installation

OracleAS Portal uses Oracle Text to extend its search capabilities. To check that Oracle Text is working correctly, you can use the TEXTTEST utility. This utility is located at MID_TIER_ORACLE_HOME/portal/admin/texttest/texttest.

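This appendix contains the following sections:

  • Section H.1, "When to Use TEXTTEST"

  • Section H.2, "Before Running TEXTTEST"

  • Section H.3, "Running TEXTTEST"

  • Section H.4, "Understanding TEXTTEST Results"

  • Section H.5, "Configuring TEXTTEST"

  • Section H.6, "Descriptions of TEXTTEST Tests"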

Note:

This utility only checks Oracle Text functionality that is specifically required by OracleAS Portal.

H.1 When to Use TEXTTEST

Oracle Text functionality is now enabled in OracleAS Portal by default and therefore all new OracleAS Portal installations expect Oracle Text to be present and functioning correctly. The TEXTTEST utility is useful if you want to:

If you choose to disable Oracle Text searching functionality in OracleAS Portal, you do not need to run this utility.

H.2 Before Running TEXTTEST

  1. Run the TEXTTEST utility from an Oracle Application Server Oracle home. The utility requires access to:

    • A working Perl installation (TEXTTEST has been tested with Perl 5.6.1).

    • Perl DBI and DBD::Oracle modules. The DBD::Oracle modules themselves require the Oracle Database client libraries.

      To ensure access, add the Perl executable directory to the path (setenv PATH $ORACLE_HOME/perl/bin:$PATH) and add the Perl library directory to the Perl library path (setenv PERL5LIB $ORACLE_HOME/perl/lib/5.6.1:$PERL5LIB).

    All of these are found in an Oracle Application Server Oracle home.

  2. Ensure the correct Oracle home is selected.

    • For UNIX platforms, ensure the ORACLE_HOME environment variable is set and that the library path used by ld includes ORACLE_HOME/ctx/lib. The library path environment variables for the different UNIX platforms are as follows:

      Solaris, Tru64 UNIX, Linux: $LD_LIBRARY_PATH

      HP-UX: $SHLIB_PATH and $LD_LIBRARY_PATH

      IBM AIX: $LIBPATH

    • On Windows, use the Oracle home selector to choose the correct Oracle home.

    This is necessary so that the Perl DBD::Oracle module can find the correct Oracle client libraries. TEXTTEST also makes reference to the Oracle home environment variable. The Oracle home selected must be the Oracle Application Server Oracle home from where you intend to run the TEXTTEST utility.

  3. Ensure that Perl can resolve the Perl module Portal::Text::Test.

    This module resides at:

    ORACLE_HOME/perl/lib/site_perl/5.6.1/Portal/Text/Test.pm
    
    

    If you are using the Perl installation from the Oracle Application Server Oracle home, this location is automatically included in the @INC path and no action is necessary. However, if you are using another Perl installation to run the utility, you may need to ensure that this location is included in the @INC path before running TEXTTEST. One way to do this is to set the PERL5LIB environment variable to include:

    ORACLE_HOME/perl/lib/site_perl/5.6.1
    ORACLE_HOME/perl/lib/5.6.1:$PERL5LIB
    
    
  4. If necessary, configure some of the tests that TEXTTEST will run.

    For example, you may need to do this if your Oracle Application Server installation is behind a firewall and the URL tests access content on the Internet. See Section H.5, "Configuring TEXTTEST" for more information.
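
The environment preparation in steps 1 to 3 can be sketched for a Bourne-style shell as follows. This is an illustration only: the default Oracle home path is a placeholder, the use of LD_LIBRARY_PATH assumes Solaris, Tru64 UNIX, or Linux, and the export syntax replaces the setenv form used by C shells.

```shell
#!/bin/sh
# Sketch only: Bourne-shell version of the environment set up in steps 1 to 3.
# The default path below is a placeholder, not a real installation location.
ORACLE_HOME="${ORACLE_HOME:-/path/to/oracle_home}"
export ORACLE_HOME

# Step 1: find the Perl that ships with the Oracle home first.
PATH="$ORACLE_HOME/perl/bin:$PATH"
export PATH

# Step 2: let the dynamic linker find the Oracle libraries.
# Use SHLIB_PATH on HP-UX or LIBPATH on AIX instead, as listed above.
LD_LIBRARY_PATH="$ORACLE_HOME/ctx/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# Step 3: make Portal::Text::Test resolvable when another Perl is used.
PERL5LIB="$ORACLE_HOME/perl/lib/site_perl/5.6.1:$ORACLE_HOME/perl/lib/5.6.1${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

# Optional check, run by hand: exits silently if the module can be loaded.
#   perl -MPortal::Text::Test -e 1
```

The `${VAR:+:$VAR}` expansions append the previous value only when one exists, which avoids leaving an empty entry at the end of the path.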

H.3 Running TEXTTEST

The TEXTTEST utility is located at MID_TIER_ORACLE_HOME/portal/admin/texttest/texttest. The default document directory is ORACLE_HOME/Portal/admin/texttest/doc.

You can run the TEXTTEST utility from the command line or DOS prompt. If you run the utility with no arguments, usage information is displayed. The syntax is as follows, and Table H-1 describes the arguments:

ORACLE_HOME/perl/bin/perl texttest -c sys_connect_string [-v] [-k] [-d document_directory] [-t textcase_schema] [-p proxy] [-n noproxy] [-u url_data_file]

Table H-1 TEXTTEST Parameters

Parameter    Description

-c           Connect string for a schema with DBA privileges, which is used to create the test schema. For example, sys/change_on_install@orcl as sysdba

-v           Show verbose output.

-k           Keep the test schema after the tests.

-d           Document directory containing documents to upload. The document indexing tests use these uploaded documents. If not specified, TEXTTEST looks for a directory called 'doc' in the same location as this script.

-t           Name of the test schema. This is the schema that is created and in which the tests are run. The default is TEXTCASE. The password is the same as the schema name. If the schema already exists, the existing schema is used. Be careful: unless you specify the -k option, the schema is still dropped at the end of the tests.

-p           Proxy to use for the URL indexing tests. For example, global.uk.mycompany.com:80. The port is optional. The same proxy is used for both HTTP and FTP URLs.

-n           No-proxy domains: a comma-separated list of up to 16 domains for which the proxy is not used. For example, uk.mycompany.com,us.mycompany.com

-u           URL indexing test data file location.


The only mandatory argument is -c (the database connection information), which must be a SQL*Plus style connect string. The schema specified in -c is used only to connect to the database and create the test schema; it is not the schema in which the tests run. This schema needs DBA privileges. If you need to connect with a particular role, such as SYSDBA when connecting to the SYS schema, specify this in the normal SQL*Plus format.

Note that if the -c argument contains spaces, you must enclose it in quotes. For example,

texttest -c 'sys/change_on_install@orcl as sysdba'

The -t argument specifies the name of the schema in which the tests are run. The default schema name is TEXTCASE. This schema is created in the early stages of the tests and is normally dropped at the end of the tests. Ensure that the database does not already contain a schema of this name that you want to keep: if the test schema already exists, TEXTTEST uses it and, unless you specify -k, drops it at the end of the tests.
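
The quoting rule for -c can be illustrated with a small helper that prints a TEXTTEST command line instead of executing it. This is a sketch, not part of the utility: texttest_cmd is a hypothetical function, TEXTCASE2 is an example schema name, and the literal text ORACLE_HOME is printed when the variable is not set.

```shell
#!/bin/sh
# Sketch only: print a TEXTTEST command line instead of executing it.
# texttest_cmd is a hypothetical helper, not part of the utility.
texttest_cmd() {
    connect=$1   # value for -c; may contain spaces
    shift        # remaining arguments are passed through unchanged
    # Single quotes keep 'user/password@db as sysdba' together as one argument.
    printf "%s/perl/bin/perl texttest -c '%s'" "${ORACLE_HOME:-ORACLE_HOME}" "$connect"
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"
    fi
    printf '\n'
}

# Verbose run that keeps a non-default test schema afterwards.
texttest_cmd 'sys/change_on_install@orcl as sysdba' -v -k -t TEXTCASE2
```

Printing the command first makes it easy to confirm that -k is present before running against a database where the named schema already exists.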

H.4 Understanding TEXTTEST Results

By default, TEXTTEST reports only whether each test passed or failed, that is, OK or Not OK. For more detailed information about the tests and what causes them to fail, run TEXTTEST in verbose mode by specifying the -v command line flag. The information displayed in verbose mode is discussed further in the sections that follow.

See Section H.6, "Descriptions of TEXTTEST Tests" for details about why a test fails. Remember that if some of the tests fail, it may cause other tests to fail later on. For example, if the first connect to the database fails, then all subsequent tests also fail. Therefore, it is recommended that you investigate failures in the order they occur.
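
Because later failures are often consequences of an earlier one, it can help to locate the first failing test in a saved run. The following sketch assumes that each failed test is reported on a line containing the text "Not OK"; the exact output layout is not specified here, and both first_failure and the log file name are hypothetical.

```shell
#!/bin/sh
# Sketch only: report the earliest failing test in a saved TEXTTEST run.
# Assumes each failed test is reported on a line containing "Not OK".
first_failure() {
    # -i ignores case, -n prefixes the line number; head keeps the first hit.
    grep -i -n 'not ok' "$1" | head -n 1
}

# Example (texttest.log is a hypothetical file holding the captured output):
#   perl texttest -c 'sys/change_on_install@orcl as sysdba' -v > texttest.log 2>&1
#   first_failure texttest.log
```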

H.5 Configuring TEXTTEST

Use the file ORACLE_HOME/perl/lib/site_perl/5.6.1/Portal/Text/Config.pm to customize the default behavior of TEXTTEST. This file contains a Perl hash definition which itself contains definitions for various default values.

In most cases these values can be overridden by specifying command line arguments. Refer to Section H.3, "Running TEXTTEST" for details. If there is a default value defined in Config.pm and no command line value is specified, then the value from Config.pm is used.

Edit Config.pm if you want to change the default values permanently. This may be useful, for example, if you always want to have proxy settings defined and you do not want to specify them every time on the command line. However, it is possible to run TEXTTEST successfully without modifying this configuration file.

H.5.1 Configuring Document Tests

OracleAS Portal uses Oracle Text functionality to search document content that is uploaded into the portal. When content is uploaded, it is stored in OracleAS Portal database tables. Before it can be searched, the content must be indexed. During the indexing process, Oracle Text processes each of the uploaded documents in turn. If a document is in a binary format (for example, a Word document or a PowerPoint document), it must be filtered and converted to plain text before it is indexed.

To test this functionality TEXTTEST creates a document table, uploads a number of files and attempts to filter them. The files that are uploaded are taken from a document directory. The default location is configured in Config.pm as ORACLE_HOME/Portal/admin/texttest/doc.

Oracle Text cannot filter every document. You can therefore place documents in the document directory that are expected to fail the indexing test with a specific error. Because the error is expected, the test still passes when the error occurs.

To test this behavior, you can configure a list of expected exceptions in an exceptions file. This file lists the file name and the expected error. You must enter one file name in each line followed by the expected error, separated by a space. If the file name contains a space, it should be escaped using \ as an escape character.

The error is treated as a Perl regular expression, so it does not need to contain the whole error message. At the simplest level, you can specify part of the error string and this will match. This enables you to specify just the error code, for example. More complicated Perl regular expressions are also permissible. Refer to the perlre page in perldoc for more information on Perl regular expressions. If the expected error is simply *, any exception is expected, and no failure whilst indexing this document will cause a test failure.

For example, the file might contain these four lines:

searchnotes.zip DRG-11207: user filter command exited with status 1 
# The following PDF has security and cannot be filtered
my\ secured\ pdf.pdf DRG-11207 
search.jar *

The first line includes the entire error. The second line is a comment that is ignored. The third line treats any DRG-11207 error as expected. The fourth line can fail with any error and the test still passes.

By default, the document indexing exceptions file is called index_exceptions and it is located in the document indexing directory (configured in Config.pm). If the location is specified as a relative path, it is relative to the document directory.
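
The matching rules for the exceptions file can be approximated in a few lines of shell. This is a sketch of the described behavior, not the logic TEXTTEST itself uses: is_expected is a hypothetical helper, and grep -E (POSIX extended regular expressions) stands in for Perl regular expressions, so patterns that rely on Perl-only syntax or on backslashes will behave differently.

```shell
#!/bin/sh
# Sketch only: decide whether an indexing error is listed as expected.
# is_expected is a hypothetical helper; grep -E stands in for Perl regex matching.
is_expected() {
    doc=$1; err=$2; list=$3
    # read without -r honors "\ ", so an escaped space stays in the file name.
    # The last variable (pattern) receives the rest of the line.
    while read name pattern || [ -n "$name" ]; do
        case $name in
            ''|'#'*) continue ;;          # blank line or comment
        esac
        [ "$name" = "$doc" ] || continue  # entry is for another file
        [ -n "$pattern" ] || continue     # no expected error given
        [ "$pattern" = '*' ] && return 0  # any error is accepted
        if printf '%s\n' "$err" | grep -E -q -e "$pattern"; then
            return 0
        fi
    done < "$list"
    return 1
}

# Demonstration with the four example lines, written to a temporary file.
exceptions=$(mktemp)
cat > "$exceptions" <<'EOF'
searchnotes.zip DRG-11207: user filter command exited with status 1 
# The following PDF has security and cannot be filtered
my\ secured\ pdf.pdf DRG-11207 
search.jar *
EOF
if is_expected 'my secured pdf.pdf' 'DRG-11207: user filter command exited with status 1' "$exceptions"; then
    echo "expected failure"
fi
rm -f "$exceptions"
```

The demonstration prints "expected failure" because the third entry names the file (with its escaped spaces) and its error code appears in the reported error.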

Note that due to limitations in the Perl DBD::Oracle module, it is not possible to stream the documents from the file system to the database. Instead, the entire document is loaded into memory before being uploaded to the database. This means that you need enough free memory to hold an entire document, but only one document is held in memory at a time.

H.5.2 Configuring URL Tests

OracleAS Portal uses Oracle Text functionality to fetch URLs that are listed as URL attributes on URL items, other items, or pages. Once fetched, the content is indexed and becomes searchable.

TEXTTEST tests this functionality by creating a similar URL index. The test data for URL testing consists of a list of URLs. TEXTTEST loads the URLs from a URL data file. Each line in the data file contains a URL to attempt to index. It may also optionally contain an error message. If that error is found while indexing the corresponding URL, it is accepted as an expected error and does not cause the test to fail.

The expected error message is taken as a Perl regular expression that is matched against the error obtained from indexing. You must separate the expected error from the URL by a space character. If the expected error is specified as * then any error is treated as expected and does not cause the test to fail. For example:

http://www.oracle.com
http://www.google.com DRG-11614: URL store: communication with host specified in http://www.google.com timed out
http://www.imaginaryurl.com DRG-11612: URL store: unknown host specified in http://www.imaginaryurl.com
http://www.anotherimaginaryurl.com DRG-11612
http://www.expectederror.com *

The first URL is expected to be found. An error is reported if it cannot be indexed.

http://www.google.com is expected to time out (perhaps because the portal is behind a firewall and no proxies are specified). If this failure occurs, the test still passes.

http://www.imaginaryurl.com is expected to fail with an unknown host error.

http://www.anotherimaginaryurl.com is also expected to fail with an unknown host error. Note that it's not necessary to specify the whole error string. Because it's treated as a regular expression just the error code will match. If it fails with this error the test will still pass.

http://www.expectederror.com will never cause the test to fail. Because its expected error is *, any error that occurs while indexing it is treated as expected and the test still passes.
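
To review a URL data file before a run, the three cases above can be summarized with a short awk filter. This is a sketch under stated assumptions: describe_urls is a hypothetical helper, URLs are assumed to contain no spaces, and the URL is assumed to be separated from the expected error by space characters.

```shell
#!/bin/sh
# Sketch only: summarize how each line of a URL data file will be treated.
# describe_urls is a hypothetical helper; it assumes the URL contains no spaces.
describe_urls() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        NF == 1 { print $1 ": must index without error"; next }
        NF == 2 && $2 == "*" { print $1 ": any error is accepted"; next }
        {
            url = $1
            sub(/^[^ ]+ +/, "")                            # drop the URL, keep the pattern
            print url ": expected error matching /" $0 "/"
        }
    ' "$1"
}

# Usage: describe_urls path/to/url_data_file
```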

Section 8.3.10, "Viewing Indexing Errors" describes some of the most common Oracle Text URL error messages.

Expected and unexpected errors are reported when TEXTTEST is run in verbose mode (-v command line flag). When TEXTTEST is run it opens the URL data file and uses it to populate the URL test table. This enables you to amend and augment the list of URLs used for testing by changing the contents of the file.

The default location for the URL data file is specified in the file Config.pm. Alternatively you can specify a URL test data file using the -u command line argument when running TEXTTEST. For example:

texttest -c 'sys/change_on_install@orcl as sysdba' -u ORACLE_HOME/Portal/texttest/url

ORACLE_HOME/Portal/texttest/url is the default location for the URL data file, within the Oracle Application Server Oracle home.

You may change the URL details if you think a specific URL is causing problems in your portal installation. Or perhaps, your Oracle Application Server installation resides behind a firewall and you wish to change the URL test data to include URLs that are local to your intranet, rather than public URLs on the Internet.

H.5.3 URL Tests and Proxies

If your portal installation resides behind a firewall it may be necessary to configure Oracle Text to use a proxy before it can fetch URLs that reside beyond the firewall.

If you run TEXTTEST in these circumstances without setting proxies, the URL indexing tests fail. In this case you have three choices:

  • Remove the failing URLs from the test data set. Simply remove the line from the URL data file.

  • Mark the offending tests as expected to fail. Do this by placing the URL followed by the expected error message in the URL data file.

  • Specify a proxy to use. See Section H.5.4, "Specifying Proxies for Use with URL Indexing Tests" for details.

H.5.4 Specifying Proxies for Use with URL Indexing Tests

You can specify a proxy to use in two locations:

  • In the file Config.pm that contains separate settings for ftp_proxy and http_proxy.

  • Using the -p parameter for the TEXTTEST script. In this case, the same proxy is used for both HTTP and FTP proxies.

In both cases the form of the proxy should be <hostname>.<domain>:<port>. The port is optional. For example,

www-proxy.us.abc.com:80
emeacache.abc.com

The -n command line argument and the no_proxy Config.pm setting can both be used to specify a list of domains for which the proxy should not be used. The list should be comma separated. For example,

uk.abc.com,us.abc.com,abc.com
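
Before passing proxy settings to -p and -n, a quick sanity check can catch malformed values. The helpers below are hypothetical and deliberately simple: the host pattern is an assumption about what a well-formed host name looks like, and the limit of 16 domains is taken from the description of -n in Table H-1.

```shell
#!/bin/sh
# Sketch only: sanity-check proxy settings before passing them to -p and -n.
# All helpers are hypothetical; the host pattern is a simplifying assumption.

# valid_proxy HOST[:PORT] succeeds for a host name with an optional numeric port.
valid_proxy() {
    printf '%s\n' "$1" | grep -E -q '^[A-Za-z0-9.-]+(:[0-9]+)?$'
}

# count_domains LIST prints the number of comma-separated entries in LIST.
count_domains() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# valid_no_proxy LIST succeeds if LIST holds at most 16 domains.
valid_no_proxy() {
    [ "$(count_domains "$1")" -le 16 ]
}

valid_proxy 'www-proxy.us.abc.com:80' && echo "proxy looks well formed"
count_domains 'uk.abc.com,us.abc.com,abc.com'
```

With the example values from this section, the script reports that the proxy is well formed and counts three no-proxy domains.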

H.6 Descriptions of TEXTTEST Tests

This section describes each of the tests that TEXTTEST performs and outlines some of the common causes for failure of each test.

H.6.1 Connect to Database as User sys

Description:

Connects to the database as the privileged user that is used to create the test schema. This user is referred to as the sys user or the sys schema. However, it does not have to be the user sys: any sufficiently privileged user will do.

Possible cause of failure:

  • Incorrect schema name or password.

  • If the user, such as sys, needs to connect with a specific role, the role must be specified in the connect string in the usual format, for example, sys/change_on_install as sysdba.

When this test fails, it causes other tests to fail.

H.6.2 Create textcase Schema

Description:

Creates the schema into which test objects are installed. By default, this schema is called textcase and it is referred to as the test schema.

Possible cause of failure:

  • The user with which TEXTTEST is connected does not have privileges to create other users.

  • There are several other reasons why it might not be possible to create a new schema, for example, there may be insufficient space in the database.

When this test fails, it causes other tests to fail.

H.6.3 Grant DBA Role to textcase Schema

Description:

Grants the DBA role to the test schema. This allows it to directly create and remove objects from the ctxsys schema.

Possible cause of failure:

  • The user with which TEXTTEST is connected does not have the necessary privileges to grant the DBA role to another user. It must have the DBA role itself to do this.

H.6.4 Grant CTXAPP Role to textcase Schema

Description:

Grants the CTXAPP role to the test schema. This is required when using Oracle Text features.

Possible cause of failure:

  • The user with which TEXTTEST is connected does not have the necessary privileges to grant CTXAPP to another user. It must have the DBA role itself to do this.

  • The CTXAPP role is missing. This indicates an incomplete, corrupt or missing Oracle Text installation.

H.6.5 Disconnect From sys

Description:

TEXTTEST disconnects from the sys schema to reconnect to the test schema.

Possible cause of failure:

  • No obvious cause of failure.

H.6.6 Connect to textcase Schema

Description:

TEXTTEST reconnects to the test schema to begin creating schema objects and running Oracle Text tests.

Possible cause of failure:

  • No obvious cause of failure.

H.6.7 Create textcase Item Related Tables

Description:

Creates the tables used for testing item indexing with a user datastore.

Possible causes of failure:

  • No obvious cause of failure.

  • General database problems such as insufficient free tablespace to complete the operation.

H.6.8 Populate Item Tables

Description:

Populates the tables used for item indexing tests. They are populated using data held within the TEXTTEST script itself.

Possible cause of failure:

  • No obvious cause of failure.

H.6.9 Create Document Table

Description:

Creates the table used for document filtering and indexing tests.

Possible cause of failure:

  • No obvious cause of failure.

H.6.10 Populate Document Table

Description:

Populates the document table from a specified document directory.

Possible cause of failure:

  • The specified document directory cannot be found or is not readable. The files within the document directory must be readable.

  • Insufficient memory on the computer where TEXTTEST is running to hold any one of the documents in memory.

H.6.11 Create URL Table

Description:

Creates the table used for URL indexing tests.

Possible cause of failure:

  • No obvious cause of failure.

H.6.12 Populate URL Table

Description:

Populates the tables used for URL indexing tests. They are populated from the URL data file. See Section H.5.2, "Configuring URL Tests" for details.

Possible cause of failure:

  • The URL indexing data file cannot be found, or is not readable.

  • Data within the URL data file is in an incorrect format.

H.6.13 Create Oracle Text Datastore Procedure

Description:

Creates a datastore procedure in the ctxsys schema. The test user has DBA privileges and this procedure is created or replaced, so if the ctxsys schema is installed, there should not be a problem.

Possible cause of failure:

  • The ctxsys schema is not present, which also implies that Oracle Text is not installed in the database.

H.6.14 Create Oracle Text Preferences

Description:

Creates the Oracle Text preferences (not including the Lexer preferences). Any existing preferences are dropped to avoid clashes.

Possible cause of failure:

  • Problems with the Oracle Text installation.

  • Problems with the compatibility of the preferences that TEXTTEST is attempting to create with this Oracle Text version, that is, preference version is not as expected.

H.6.15 Create Lexer Preferences

Description:

Creates the Oracle Text lexer preferences. Any existing preferences are dropped to avoid clashes.

Possible cause of failure:

  • Problems with the Oracle Text installation.

  • Problems with the compatibility of the preferences that TEXTTEST is attempting to create with this Oracle Text version, that is, preference version is not as expected.

H.6.16 Create Section Group and Zone Sections

Description:

Creates the section groups and zone sections for the item indexing tests.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.17 Create Oracle Text Item Index

Description:

Creates the Oracle Text index for testing item indexing with a user datastore. This test does not populate the index.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.18 Create Oracle Text Document Index

Description:

Creates the Oracle Text index for testing document indexing. This test does not populate the index.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.19 Create Oracle Text URL Index

Description:

Creates the Oracle Text index for testing URL indexing. This test does not populate the index.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.20 Touch All Item Content So That Pending

Description:

Updates all of the rows in the items test table so that they are placed in the Oracle Text pending queue.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.21 Touch All Document Content So That Pending

Description:

Updates all of the rows in the document test table so that they are placed in the Oracle Text pending queue.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.22 Touch All URL Content So That Pending

Description:

Updates all of the rows in the URL test table so that they are placed in the Oracle Text pending queue.

Possible cause of failure:

  • No obvious cause of failure.

  • Possibly a problem with the Oracle Text installation, or the failure of one of the previous tests.

H.6.23 Synchronize Item Index

Description:

Synchronizes the Oracle Text index on the item indexing test tables. This causes the content to be indexed.

Because the data set used for the item indexing is controlled and internal to the TEXTTEST script, this test is always expected to pass.

Possible cause of failure:

  • A previous test has failed.

  • Possibly a problem with the Oracle Text installation. Verify the Oracle Text installation and reinstall if necessary. Ensure that you complete all manual steps for any database upgrades, as these often contain Oracle Text related steps.

H.6.24 Synchronize Document Index

Description:

Synchronizes the Oracle Text index on the document indexing test table. This causes the content to be indexed.

Possible cause of failure:

  • One of the documents uploaded for the test could not be filtered. This is not necessarily a problem as the document might not be in one of the formats that are filterable by Oracle Text.

    Consult the Oracle Text Reference (see chapter on supported formats). Either remove the document, or mark it as an expected failure (see Section H.5.1, "Configuring Document Tests" for details).

  • An unexpected indexing failure, either caused by a bug in the filtering software or by incorrect configuration.

    Consult the Oracle Text Reference and Chapter 8, "Configuring the Search Features in OracleAS Portal" for more information. If the Oracle Text installation is configured correctly and the document format is a supported one but it still cannot be filtered, contact Oracle Support Services.

H.6.25 Synchronize URL Index

Description:

Synchronizes the Oracle Text index on the URL indexing test tables. This causes the content to be indexed.

Possible cause of failure:

  • One of the URLs specified in the URL indexing test data may not be returning HTML or plain text that can be indexed by Oracle Text. This can happen for a number of reasons. The URL may be incorrect or the site might be unavailable.

  • If the database instance is behind a firewall and the URL is beyond the firewall, then it might be necessary to configure the tests to use a proxy server. See Section H.5.2, "Configuring URL Tests" for more information. If the URL is expected to fail, you can mark it as such in the URL test data so that this test will pass.

H.6.26 Drop Datastore Procedure from ctxsys

Description:

Drops the datastore procedure created in the ctxsys schema.

This test is not carried out if the -k option is used to keep the test schema once the tests are completed. See Section H.3, "Running TEXTTEST" for more information.

Possible cause of failure:

  • No obvious cause of failure.

H.6.27 Disconnect From textcase Schema

Description:

Disconnects from the test schema.

Possible cause of failure:

  • No obvious cause of failure.

H.6.28 Connect As User sys

Description:

Reconnects to the sys schema to drop the test schema.

This test is not carried out if the -k option is used to keep the test schema once the tests are completed. See Section H.3, "Running TEXTTEST" for more information.

Possible cause of failure:

  • No obvious cause of failure.

H.6.29 Drop textcase Schema

Description:

Drops the test schema.

This test is not carried out if the -k option is used to keep the test schema once the tests are completed. See Section H.3, "Running TEXTTEST" for more information.

Possible cause of failure:

  • No obvious cause of failure.

H.6.30 Disconnect From Database

Description:

Disconnects from the sys schema.

Possible cause of failure:

  • No obvious cause of failure.