Sun Java System Portal Server 6 2005Q4 Technical Reference Guide 

Chapter 53
Robot Application Functions - Filtering Support Functions

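This chapter contains the following sections: Introduction, assign-source, assign-type-by-extension, clear-source, convert-to-html, copy-attribute, generate-by-exact, generate-by-prefix, generate-by-regex, generate-md5, generate-rd-expires, generate-rd-last-modified, and rename-attribute.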


Introduction

The functions discussed in this chapter are used during filtering to manipulate or generate information about the resource. The robot can then process the resource by calling filtering functions. These support functions can be used in Enumeration and Generation filters in the filter.conf file.


assign-source

The assign-source function assigns a new value to a given information source. This permits editing during the filtering process. The function can assign an explicit new value, or it can copy a value from another information source.

Parameters

The following table lists the parameters used with the assign-source function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source whose value is to be changed.
value        Specifies an explicit value.
src          Information source to copy to dst.

You must specify either a value parameter or a src parameter, but not both.

Example

Data fn=assign-source dst=type src=content-type
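The either/or rule for value and src can be modelled in a few lines of Python. This is only an illustrative sketch, not the robot's implementation: the sources dictionary and the assign_source helper are hypothetical names that stand in for the robot's information sources.

```python
def assign_source(sources, dst, value=None, src=None):
    """Model of assign-source: set sources[dst] from a value or from another source.

    `sources` is a hypothetical dict standing in for the robot's information sources.
    """
    # Exactly one of value/src must be supplied, mirroring the documented rule.
    if (value is None) == (src is None):
        raise ValueError("specify either value or src, but not both")
    sources[dst] = value if value is not None else sources[src]
    return sources


# Mirrors: Data fn=assign-source dst=type src=content-type
sources = {"content-type": "text/html"}
assign_source(sources, dst="type", src="content-type")
print(sources["type"])
```

Rejecting a call that supplies both parameters, or neither, keeps the ambiguity out of the model in the same way the documented rule does.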


assign-type-by-extension

The assign-type-by-extension function uses the resource’s file name to determine its type and assigns this type to the resource for further processing.

The setup-type-by-extension function must be called during setup before the assign-type-by-extension function can be used.

Parameters

The following table lists the parameter used with the assign-type-by-extension function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
src          Source of file name to compare. If you do not specify a source, the default is the resource’s path.

Example

MetaData fn=assign-type-by-extension
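As a rough illustration of extension-based typing, the following Python sketch looks up the file extension of a source in a small table. Everything here is an assumption made for the example: the EXTENSION_TYPES table is a hypothetical stand-in for whatever setup-type-by-extension prepares, and storing the result in a source named type is a guess at where the robot records it.

```python
import posixpath

# Hypothetical extension table; in the robot this mapping would be prepared
# by setup-type-by-extension, and its real contents are not shown here.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".pdf": "application/pdf",
    ".rtf": "application/rtf",
}


def assign_type_by_extension(sources, src="path"):
    """Set sources['type'] from the file extension found in sources[src]."""
    _, ext = posixpath.splitext(sources.get(src, ""))
    mime_type = EXTENSION_TYPES.get(ext.lower())
    if mime_type is not None:
        sources["type"] = mime_type  # assumed destination name
    return sources


print(assign_type_by_extension({"path": "/docs/guide.PDF"}))
```

The src argument defaults to path, matching the documented default; an unknown extension leaves the sources unchanged in this sketch.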


clear-source

The clear-source function deletes the specified data source. You typically do not need to call this function, because you can create or replace a source by using the assign-source function.

Parameters

The following table lists the parameter used with the clear-source function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
src          Name of source to delete.

Example

The following example deletes the path source:

MetaData fn=clear-source src=path


convert-to-html

The convert-to-html function converts the current resource into an HTML file for further processing, if its type matches a specified MIME type. The conversion filter automatically detects the type of the file it is converting.

Parameters

The following table lists the parameter used with the convert-to-html function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
type         MIME type from which to convert.

Example

The following sequence of function calls causes the filter to convert all Adobe Acrobat PDF files, Microsoft RTF files, and FrameMaker MIF files to HTML, as well as any files whose type was not specified by the server that delivered them.

Data fn=convert-to-html type=application/pdf

Data fn=convert-to-html type=application/rtf

Data fn=convert-to-html type=application/x-mif

Data fn=convert-to-html type=unknown


copy-attribute

The copy-attribute function copies the value from one field in the resource description into another.

Parameters

The following table lists the parameters used with the copy-attribute function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
src          Field in the resource description from which to copy.
dst          Item in the resource description into which to copy the source.
truncate     Maximum length of the source to copy.
clean        Boolean parameter indicating whether to fix truncated text (such as not leaving partial words). This parameter is false by default.

Example

Generate fn=copy-attribute \
    src=partial-text dst=description truncate=200 clean=true
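The interaction between truncate and clean is easier to see in a small model. The Python sketch below is one possible reading, not the robot's algorithm: it assumes that cleaning means dropping a trailing word fragment, and rd is a hypothetical dictionary standing in for the resource description.

```python
def copy_attribute(rd, src, dst, truncate=None, clean=False):
    """Copy rd[src] into rd[dst], optionally truncating and tidying the text.

    `rd` is a hypothetical dict standing in for the resource description.
    """
    text = rd[src]
    if truncate is not None and len(text) > truncate:
        cut = text[:truncate]
        if clean:
            # Assumed meaning of clean: if the character after the cut is not
            # whitespace, the last word may be partial, so drop back to the
            # last space (when there is one) and trim trailing whitespace.
            if not text[truncate].isspace() and " " in cut:
                cut = cut[: cut.rfind(" ")]
            cut = cut.rstrip()
        text = cut
    rd[dst] = text
    return rd


rd = {"partial-text": "The quick brown fox"}
copy_attribute(rd, "partial-text", "description", truncate=12, clean=True)
print(rd["description"])
```

With clean left at its default of false, the same call would keep the raw 12-character cut, including the word fragment at the end.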


generate-by-exact

The generate-by-exact function generates a source with a specified value, but only if an existing source exactly matches another value.

Parameters

The following table lists the parameters used with the generate-by-exact function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source to generate.
value        Value to assign to dst.
src          Source against which to match.
match        Value that src must match exactly.

Example

The following example sets the classification to Siroe if the host source is exactly www.siroe.com:80:

Generate fn="generate-by-exact" match="www.siroe.com:80" src="host" value="Siroe" dst="classification"


generate-by-prefix

The generate-by-prefix function generates a source with a specified value, but only if the prefix of an existing source matches another value.

Parameters

The following table lists the parameters used with the generate-by-prefix function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source to generate.
value        Value to assign to dst.
src          Source against which to match.
match        Value to compare to src.

Example

The following example sets the classification to World Wide Web if the protocol begins with http:

Generate fn="generate-by-prefix" match="http" src="protocol" value="World Wide Web" dst="classification"


generate-by-regex

The generate-by-regex function generates a source with a specified value, but only if an existing source matches a regular expression.

Parameters

The following table lists the parameters used with the generate-by-regex function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source to generate.
value        Value to assign to dst.
src          Source against which to match.
match        Regular expression string to compare to src.

Example

The following example sets the classification to Siroe if the host name matches the regular expression *.siroe.com. For example, resources at both developer.siroe.com and home.siroe.com will be classified as Siroe:

Generate fn="generate-by-regex" match="\\*.siroe.com" src="host" value="Siroe" dst="classification"
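The three generate-by functions differ only in how src is compared with match, which the following Python sketch tries to make explicit. It is a model under stated assumptions rather than the robot's code: the sources dictionary is hypothetical, the regular expression uses Python syntax (which may differ from the robot's dialect), and requiring the expression to match the whole string is a guess.

```python
import re

# Each matcher models one function; the whole-string regex match is an assumption.
MATCHERS = {
    "generate-by-exact": lambda current, match: current == match,
    "generate-by-prefix": lambda current, match: current.startswith(match),
    "generate-by-regex": lambda current, match: re.fullmatch(match, current) is not None,
}


def generate(fn, sources, src, match, value, dst):
    """Set sources[dst] = value only when sources[src] satisfies the matcher for fn."""
    current = sources.get(src)
    if current is not None and MATCHERS[fn](current, match):
        sources[dst] = value
    return sources


sources = {"host": "developer.siroe.com", "protocol": "http"}
# Python-syntax pattern chosen for this sketch, not the pattern from filter.conf.
generate("generate-by-regex", sources, "host", r".*\.siroe\.com", "Siroe", "classification")
print(sources["classification"])
```

When the comparison fails, the destination source is left untouched, which is why these functions can be chained with the most specific rule last.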


generate-md5

The generate-md5 function generates an MD5 checksum and adds it to the resource. You can then use the filter-by-md5 function to deny resources with duplicate MD5 checksums.

Parameters

none

Example

Data fn=generate-md5
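To show why a checksum is useful for duplicate detection, the following Python sketch computes MD5 digests with the standard hashlib module and skips any body whose digest has already been seen. The deny_duplicates helper is a hypothetical illustration of the kind of check filter-by-md5 performs; it is not taken from the robot.

```python
import hashlib


def generate_md5(content: bytes) -> str:
    """Return the hexadecimal MD5 digest of a resource body."""
    return hashlib.md5(content).hexdigest()


def deny_duplicates(bodies):
    """Yield only the bodies whose checksum has not been seen before."""
    seen = set()
    for body in bodies:
        checksum = generate_md5(body)
        if checksum in seen:
            continue  # duplicate content: a filter-by-md5 style check would deny it
        seen.add(checksum)
        yield body


pages = [b"<html>a</html>", b"<html>b</html>", b"<html>a</html>"]
unique = list(deny_duplicates(pages))
print(len(unique))
```

Because the checksum depends only on the content, the same document reached through two different URLs is detected as a duplicate.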


generate-rd-expires

The generate-rd-expires function generates an expiration date and adds it to the specified source. The function uses metadata such as the HTTP header and HTML <META> tags to obtain any expiration data from the resource. If none exists, it generates an expiration date three months from the current date.

Parameters

The following table lists the parameter used with the generate-rd-expires function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source. If you omit it, it defaults to rd-expires.

Example

Generate fn=generate-rd-expires
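The fallback behaviour can be sketched in Python as follows. This is a simplified model with several assumptions: it consults only an HTTP Expires header (not HTML <META> tags), the headers and sources dictionaries are hypothetical, and three months is approximated as 90 days.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: "three months" is approximated as 90 days in this sketch.
DEFAULT_LIFETIME = timedelta(days=90)


def generate_rd_expires(sources, headers, now=None, dst="rd-expires"):
    """Store an expiration date in sources[dst], preferring the resource's own data."""
    now = now or datetime.now(timezone.utc)
    expires = None
    raw = headers.get("Expires")
    if raw:
        try:
            expires = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            expires = None  # unparseable header: fall back to the default
    sources[dst] = expires if expires is not None else now + DEFAULT_LIFETIME
    return sources


now = datetime(2005, 1, 1, tzinfo=timezone.utc)
print(generate_rd_expires({}, {}, now=now)["rd-expires"].isoformat())
```

Passing now explicitly keeps the sketch deterministic; the dst argument defaults to rd-expires, matching the documented default.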


generate-rd-last-modified

The generate-rd-last-modified function adds the current time to the specified source.

Parameters

The following table lists the parameter used with the generate-rd-last-modified function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
dst          Name of the source. If you omit it, it defaults to rd-last-modified.

Example

Generate fn=generate-rd-last-modified


rename-attribute

The rename-attribute function changes the name of a field in the resource description. It is most useful in cases where, for example, extract-html-meta copies information from a <META> tag into a field, and you want to change the name of the field.

Parameters

The following table lists the parameter used with the rename-attribute function. The table contains two columns. The first column lists the parameter, and the second column provides a description.

Parameter    Description
src          String containing a mapping from one name to another.

Example

The following example renames an attribute from author to author-name:

Generate fn=rename-attribute src="author->author-name"
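The mapping string can be read as an old name and a new name separated by ->. The Python sketch below models that reading; rd is a hypothetical dictionary standing in for the resource description, and leaving the description unchanged when the old field is absent is an assumption.

```python
def rename_attribute(rd, mapping):
    """Rename a field in rd according to a mapping string such as 'author->author-name'."""
    old, sep, new = mapping.partition("->")
    if not sep or not old or not new:
        raise ValueError(f"expected 'old->new', got {mapping!r}")
    if old in rd:
        rd[new] = rd.pop(old)  # move the value under its new name
    return rd


rd = rename_attribute({"author": "J. Doe", "title": "Guide"}, "author->author-name")
print(sorted(rd))
```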





Part No: 819-4159.   Copyright 2005 Sun Microsystems, Inc. All rights reserved.