Oracle8 ConText Cartridge Workbench User's Guide
Release 2.3

A57700-01

Library

Product

Contents

Index

Prev Next

3
Input/Output (I/O) Utility

This chapter describes how to use the Oracle8 ConText Cartridge Input/Output (I/O) utility. You can use this utility in Windows 32-bit environments such as Windows NT and Windows 95.

The topics covered in this chapter are:

Overview of the I/O Utility

The I/O utility enables you to move text from database to client-side files in a Windows 32-bit environment and vice-versa.

The I/O utility can be used to perform the following operations:

These operations are performed in a sequence defined by a parameter file that you specify on the command line.

Command-Line Syntax

The I/O utility has the following command-line syntax:

ctxio32 [-s] [-w] [-p file] [-l file] [-v editor] [-d connect_string]

where:

Parameter   Description  

-s  

Shows login box.

This allows the user to confirm the connect string and the name of the parameter file.  

-w  

Displays status window.

Messages in the parameter file are displayed in this window during processing.  

-p file  

Specifies the name of parameter file.

For more information about the format of the file, see "Parameter File Structure" in this chapter.  

-l file  

Specifies the name of log file.

Error and debug messages, as well as any messages in the parameter file, are output to this file. No messages are reported directly to the user.  

-v editor  

Displays log file, using given editor (a Windows executable, such as Notepad).  

-d connect_string  

Specifies an explicit database username/password connection string.

If "-d none" is given, there is no log on and no prompt so no database session is started. This option would normally be used to spawn an operating system command.

If -d is omitted, the user is prompted to enter the connection string.  

Parameter File Structure

The general structure of a parameter file is similar to the tag structure used in HTML/SGML:

<START>
many lines of text
<END>
<START>
many lines of text
<END>

Start Tag   End Tag   Text   Description  

<SQL>  

</SQL>  

Limited to a maximum of 32k of SQL split over many lines. Keep each line below 2k.

If the first character is @, the SQL is read from the file that follows; for example, @oco.sql loads SQL from the file oco.sql, and executes it.

SQL read from a file must not exceed 5Mb.  

Encloses SQL commands.  

<GET>  

</GET>  

Line 1: Destination filename

Line 2: Table containing source document

Line 3: Column containing source document

Line 4: WHERE clause (excluding WHERE keyword)

 

Writes a document from a database table to an operating system file.  

<PUT>  

</PUT>  

Line 1: Source filename

Line 2: Destination table name

Line 3: Destination column name

Line 4: WHERE clause (excluding WHERE keyword)

 

Puts an operating system file into a database table.

 

<REG>  

</REG>  

Reg_key, Reg_value [, Trunc_string]  

Registration entries can be used within <EXE> </EXE> pairs.

Reg_key is the registration key, as it appears in the lefthand pane of the Registry Editor window.

Reg_value is the registry entry name as it appears in the righthand pane of the Registry Editor window. This defaults to "(Default)".

Trunc_string is optional and causes truncation of the fetched registry value from the given string. Many useful registry entries have '%1' or similar placeholders. See the parameter file examples further on.  

<LOAD>  

</LOAD>  

Line 1:

Table_name [TAB_OPTIONS]

Line2:

Column_name[COL_OPTIONS],Column_name[COL_OPTIONS]...

Line 3+:

Column_data, column_data, ...

Note:

The backslash represents the escape character. Therefore, to obtain \, use \\. Or to obtain ", use \\'.  

Line 1: TAB_OPTIONS := (Replace)

Replace: Data is updated rather than inserted into the database. This option is used if the row that satisfies the 'where' clause already exists.

Line 2:

COL_OPTIONS := (COL_OPTION, COL_OPTION...) COL_OPTION :=
ReadFromFile |
Replace "X" "Y" ["C"] |
Ignore "X" "Y" ["Z"] ["C"] |
ExtractOne "X" "Y" ["Z"] ["C"] |
ExtractAll "X" "Y" ["Z"] ["C"] |
SaveDir "X" |
TempFile "X" |
Where |
Directory "X" |
UseFilter "X" |
Sequence n |
LoadDir "A" "B" "C" "D" "E" "F" |
GeneratedFile

ReadFromFile

Reads the contents of the given file into the database column.

Replace "X" "Y" ["C"]

After reading the file replaces all occurrences of X with Y. There can be more than one Replace call per column. X and Y are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y".

Ignore "X" "Y" ["Z"] ["C"]

After reading the file removes all text between X and Y inclusive of X and Y. If Z is given, remove the text only if Z occurs between X and Y. X, Y and Z are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y".  

 

 

 

ExtractOne "X" "Y" ["Z"] ["C"]

ExtractAll "X" "Y" ["Z"] ["C"]

After reading the file, extracts all text between X and Y, excluding X and Y. If Z is given, extracts the text only if Z occurs between X and Y. ExtractOne extracts only the first instance of the text; ExtractAll extracts all instances with a carriage return/line feed between instances. X, Y and Z are case-sensitive if C is omitted or if C equals "N". X and Y are case-insensitive if C equals "Y".

SaveDir "X"

This parameter copies the post-filtered versions of the input files into directory X. This is useful for checking that search/replace processing has worked as expected.

TempFile "X"

Any temporary files are created by default in the home directory; X can be used to specify another file name that is used instead. Typically, this is not required.

Where

This defines the columns to be used to define the where clause. This is used to insert the file data into the database and to perform the updates during Replace mode.

Directory "X"

The file names given refer to files in this directory. X must end in a double backslash ("\\").

UseFilter "X"

For each file, this spawns a user-defined filter program to pre-filter the files before any Extract, Replace or Ignore actions. The program must take an input file as the first argument, and an output file as the second argument. X is the program name (including the full path).  

 

 

 

Sequence n

This generates a number of the value of this column. This value is initially n (default = 1).

LoadDir "A" "B" "C" "D" "E" "F"

This recursively scans files below directory A (which must have a trailing slash) and looks for files of type B (containing wildcards and separated by ';').

Use C as the directory separator. Prefix all files with D. Recursion ceases when more than E directory levels is reached (a value of 0 means no limit is set). Do not recurse into directories F (where multiple directories are separated by ';').

See the LoadDir Example further on.

GeneratedFile

This is related to LoadDir. The generated file from LoadDir is used as input for this column.

Line 3: Column_data = Strings, dates and so on should be quoted as if they were entered in an INSERT SQL statement.

Note:

Only ASCII files can be filtered with Replace, Ignore, ExtractOne, ExtractAll.  

<EXE>  

</EXE>  

OS command.  

Operating system command. This can be split over many lines. If so, carriage returns are removed and replaced by spaces.  

<QU>  

</QU>  

Line 1: Question

Line 2: Yes_label

Line 3: No_label  

This prompts the user with the given question and resumes execution after the line starting 'Yes_label:' if the user responds 'yes'. If the user responds 'no', execution is continued after the line starting 'No_label:'.

Alternatively, labels may take the form '#<string>'  

<GOTO>  

label  

None.  

Resumes execution after the line starting 'label:'  

<MSG>  

end of line  

Message  

Displays message in log file and, if -w option is chosen, displays in status window.

Note:

<MESS> is still supported as an alternative to <MSG>.  

<ERRORCODE>  

end of line  

Error number  

Sets the error code to the given number.

If a command fails, and the current error code is non-zero, then that error code is returned by ctxio32.exe. If the -r command line option is used in this case, the error code is also placed in the registry at the following location:

"HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\ConText Workbench\ctxio_ret"  

<AUTOCOMMIT>  

end of line  

On | Off  

Sets the standard autocommit function. If On, there is a commit after every SQL statement until either it is turned Off, or the end of the session.  

#  

end of line  

None  

Parameter file comment - not used.  

Examples

This section provides the following four examples for using the I/O utility:

LoadDir Example

"y:\\webpages\\" "*.htm *.html" "/" "http://oracle.uk.oracle.com" 20 ""

This example loads all .htm and .html files into and below y:\webpages (up to a maximum of 20 levels). The path from y:\webpages\ is then built by using '/' as the separator, and this relative path is then prefixed by 'http://oracle.uk.oracle.com/'.

So, if y:\webpages contains a.htm, subdir\b.html and subdir\c.htm, then the following three strings are loaded into the database:

'http://oracle.uk.oracle.com/a.htm'
'http://oracle.uk.oracle.com/subdir/b.html'
'http://oracle.uk.oracle.com/subdir/c.htm'

GeneratedFile Example

Head(GeneratedFile,ExtractOne "<TITLE>" "</TITLE>" "" "Y")

This example loads the file found by LoadDir and extract the text between 'TITLE' and '/TITLE'. Case is ignored because of the "Y" parameter.

Simple Parameter File Example

The file below is a sample CTXIO32 parameter file. It is installed as part of the ConText Workbench.

The command to execute this parameter file is:

ctxio32 -v notepad -s -w -p pfile.txt -l log.txt -d ctxdemo/ctxdemo 

#
# CTXIO sample script
# File : pfile.txt

#
# Script split into 2
#
# 1) First  section (#ReplaceAll)  replaces all  employees.
# 2) Second section (#ReplaceSome) replaces some employees.
#
# The <QU> .. </QU> section asks a question and jumps to the relevant tag
#

<QU>
Do you wish to replace ALL employees?
#ReplaceAll
#ReplaceSome
</QU>

#
# -----------   This section replaces all current employees -----------

#
#ReplaceAll


<MESS>Drop employee table
<SQL>
DROP TABLE CTXIO_EMP
</SQL>

<MESS>Create employee table
<SQL>
CREATE TABLE CTXIO_EMP (EMPNO NUMBER, EMPNAME VARCHAR2(100), RESUME LONG)
</SQL>
<SQL>
alter table ctxio_emp add constraint unique_ctxioemp unique(empno)
</SQL>

<MESS>Truncate employee table
<SQL>
truncate table CTXIO_EMP
</SQL>

<MESS>Loading ALL employees into database...
# Use SaveDir so we can check search and replace has required affect
<LOAD>
CTXIO_EMP
EMPNO(WHERE), EMPNAME, RESUME(ReadFromFile, Ignore "<Confidential>" "</Confidential>", Replace 
"brewing" "chemistry", Replace "Brewing" "Chemistry", SaveDir "temp")
1,"Joe Bloggs","jbloggs.htm"
2,"Ray Smith","rsmith.htm"
3,"Lisa Turner","lturner.htm"
</LOAD>

# Skip next bit which is for single update only

<GOTO> #Common

#
# --------------   This section updates the resumes of some employees --------------
#
#ReplaceSome

<MESS>Replacing resumes in database...
# Use SaveDir so we can check search and replace has required affect
<LOAD>
CTXIO_EMP (REPLACE)
EMPNO(WHERE), EMPNAME, RESUME(ReadFromFile, Ignore "<Confidential>" "</Confidential>", Replace 
"brewing" "chemistry", Replace "Brewing" "Chemistry", SaveDir "temp")
2,"Ray Smith","rsmith.htm"
</LOAD>

# Load this resume without search and replace
<PUT>
jbloggs.htm
ctxio_emp
resume
EMPNO=1
</PUT>

#
# ----------- This section common to both -----------

#
#Common

# Check Smith's CV was filtered and stored OK
<GET>
temp\rsmith.htm
ctxio_emp
resume
EMPNO=2
</GET>


#
# ----------- Create OCO index -----------
#
<MESS>Dropping policy and index (may produce errors but this is OK)...
<SQL>
begin 
ctxsys.ctx_ddl.drop_index('CTXIO_EMPRES');
end;
</SQL>
<SQL>
begin 
ctxsys.ctx_ddl.drop_policy('CTXIO_EMPRES');
end;
</SQL>

<MESS>Creating policy on test table and creating index...
<SQL>
begin 
ctxsys.ctx_ddl.create_policy(policy_name=>'CTXIO_EMPRES',
colspec=>'CTXIO_EMP.RESUME',
textkey=>'EMPNO'); 
end;
</SQL>
<SQL>
begin
ctxsys.ctx_ddl.create_index('CTXIO_EMPRES');
end;
</SQL>

#
# ----------- Do some OCO querying -----------
#

<MESS>Create marked up table
<SQL>
drop table mutab
</SQL>
<SQL>
create table mutab (id number, document long)
</SQL>

<MESS>Get highlights...
<SQL>
begin 
  ctxsys.ctx_query.highlight(CSPEC=>'CTXIO_EMPRES', TEXTKEY=>2,
   query=>'chemistry,Chemistry', mutab=>'MUTAB', ID=>999,starttag=>'<EM>',
   endtag=>'</EM>');
end;
</SQL>

<MESS>Write marked up text to file...
<GET>
mudoc.htm
mutab
document
id=999
</GET>

# This views the marked-up resume in notepad
#<EXE>
#notepad mudoc.htm
#</EXE>

# This views the HTML resume in Netscape
<EXE>
<REG>HKEY_CLASSES_ROOT\NetscapeMarkup\shell\open\command,,"</REG> file:///mudoc.htm
</EXE>

#Finished

Advanced Parameter File Example

This example demonstrates how to load a website. In particular, it shows how to use the LoadDir option of the I/O utility.


Note:

This example is not included as part of the ConText Workbench distribution

 

<MESS>Drop table
<SQL>
DROP TABLE WEBPAGE
</SQL>
<SQL>
DROP TABLE WEBPAGE_OPTIONS
</SQL>
<MESS>Create tables..
<SQL>
CREATE TABLE WEBPAGE(DOCNO NUMBER, DOC LONG, HEAD VARCHAR2(1000))
</SQL>
<SQL>
CREATE TABLE WEBPAGE_OPTIONS(NAME VARCHAR2(100), VALUE LONG, DES VARCHAR2(1000))
</SQL>
<SQL>
CREATE TABLE WEBPAGE_HL(ID NUMBER, DOCUMENT LONG)
</SQL>
<MESS>Create sequence
<SQL>
CREATE SEQUENCE HL_SEQ
</SQL>
<SQL>
alter table WEBPAGE add constraint unique_WEBPAGE unique(docno)
</SQL>
<MESS>Truncate table
<SQL>
truncate table WEBPAGE
</SQL>
<AUTOCOMMIT> On
<MESS>Scanning and loading roll ...
<LOAD>
WEBPAGE
Docno(Where, Sequence 1), Doc(LoadDir "j:\\www\\ocowsite\\" "*.htm;*.html" "/" "http://
roll.uk.oracle.com/" 20 "www"), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y")
</LOAD>
<MESS>Scanning and loading shake ...
<LOAD>
WEBPAGE
Docno(Where, Sequence 1000), Doc(LoadDir "y:\\webpages\\" "*.htm;*.html" "/" "http://
shake.uk.oracle.com/" 20), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y")
</LOAD>
<MESS>Scanning and loading uksn16 ...
<LOAD>
WEBPAGE
Docno(Where, Sequence 2000), Doc(LoadDir "h:\\etc\\httpd\\htdocs\\" "*.htm;*.html" "/" "http://
uksn16.uk.oracle.com/" 20), Head(GeneratedFile, ExtractOne "<TITLE>" "</TITLE>" "" "Y")
</LOAD>
<MESS>Dropping policy and index (may produce errors but this is OK)...
<SQL>
begin 
ctxsys.ctx_ddl.drop_index('WEBPAGE');
end;
</SQL>
<SQL>
begin 
ctxsys.ctx_ddl.drop_policy('WEBPAGE');
end;
</SQL>
<MESS>Creating policy on test table and creating index...
<SQL>
begin 
ctxsys.ctx_ddl.create_policy(policy_name=>'WEBPAGE',
colspec=>'WEBPAGE.DOC', 
filter_pref => 'CTXSYS.HTML_FILTER', 
dstore_pref => 'CTXSYS.DEFAULT_URL',
textkey=>'DOCNO'); 
end;
</SQL>
<SQL>
begin
ctxsys.ctx_ddl.create_index('WEBPAGE');
end;
</SQL>
<SQL>
TRUNCATE TABLE WEBPAGE_OPTIONS
</SQL>
<MESS>Initializing search page options ...
<LOAD>
WEBPAGE_OPTIONS
NAME, VALUE, DES
'GenShortTitle','OCO Websearch','Short generic title'
'FrontTitle','Oracle8 ConText Cartridge Web Site Search','Title of front page'
'FrontHeading','<center><h1>Oracle8 ConText Cartridge Web Site Search</h1><center>','Heading of 
front page'
'FrontIncludeCount','Y','Include web pages count (Y/N)'
'FrontContentsPreQuery','','Contents of page after count and before query section (max 32k)'
'FrontContentsPostQuery','','Contents of page after query section (max 32k)'
'ResultTitle','Oracle8 ConText Cartridge Search Results','Title of results page'
'ResultHeading','<center><h1>Oracle8 ConText Cartridge Search Results</h1></center>','Heading 
of results page'
'ResultShowExpression','Y','Show query expression (Y/N)'
'ResultLabelHeading','Page heading','Label to describe heading'
'ResultLabelScore','Score','Label to describe score'
'ResultLabelPage','Page','Label to describe page'
'ResultHighStart','<Font color=red size=+4>','Highlight on'
'ResultHighEnd','</font>','Highlight off'
'ResultTagStart','<OCOH>','Internal use'
'ResultTagEnd','</OCOH>','Internal use'
'ResultContentsPreTable','','Contents of page before the results table (max 32k)'
'ResultContentsPostTable','','Contents of page after the results table (max 32k)'
'ResultHighNextYN','Y','Show the [Next] link after each highlight (Y/N)'
'ResultHighNext','<font size=-1>[Next]</font>','If ResultHighNextYN=Y, this is  displayed after 
each highlight'
'ResultHighFirst','[First Highlight]','If ResultHighNextYN=Y, this is  displayed at start and 
end of document'
</LOAD>
<PUT>
frontpre.txt
webpage_options
value
name='FrontContentsPreQuery'
</PUT>
<PUT>
frontpos.txt
webpage_options
value
name='FrontContentsPostQuery'
</PUT>
<PUT>
respre.txt
webpage_options
value
name='ResultContentsPreTable'
</PUT>
<PUT>
respos.txt
webpage_options
value
name='ResultContentsPostTable'
</PUT>

The I/O Utility dll Interface

All the functionality of ctxio is incorporated in a single dll (ctxio32l.dll). Developers can call the C routines within this dll directly.

See Also:

For more information, see the required header file, ctxexp.h, which is installed in $ORACLE_HOME/ctxw/io/public as part of the ConText Workbench distribution.

 




Prev

Next
Oracle
Copyright © 1997 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index