Sun Java System Portal Server 7.1 Developer's Guide

Chapter 24 Search API

This chapter describes the functions and objects defined in the search.h header file.

Functions and Objects

The following table provides an alphabetized version of the functions and objects for your reference.

Table 24–1 Alphabetized Functions and Objects Defined in the search.h File

| Search function or object | Category |
| --- | --- |
| append, increase, reset, SearchBuffer_Create, SearchBuffer_Free | Memory Buffer Management |
| Search_Apply, Search_Create, Search_Find, Search_Findval, Search_Free, Search_AttributeCompare, Search_GetAttributeSize, Search_GetTotalSize, Search_GetValueCount, Search_GetValueSize, Search_InsertAVP, Search_Merge, Search_Remove | Search Structure |
| Search_AttributeCompare, Search_InsertStr, Search_Rename, Search_Replace, Search_ReplaceMV, Search_ReplaceStr, Search_SqueezeMV, SearchAVPair_Create, SearchAVPair_Free | Attribute-Value Pair Routines |
| Search_AttributeCompareMV, Search_Contains, Search_DeleteMV, Search_FindvalMV, Search_Insert, Search_InsertMV, Search_IsMVAttribute, Search_MVAttributeParse, SearchAVPair_IsMV, SearchAVPair_NthValid, SearchAVPair_NthValue, SearchAVPair_NthVsize | Multi-valued Attribute Routines |
| Search_ParseInitFile, Search_ParseInitStr, Search_PrintInitFile, Search_PrintInitFn, Search_PrintInitStr, SearchStream_Finish, SearchStream_GetAllowed, SearchStream_GetDenied, SearchStream_IsAllowed, SearchStream_IsEOS, SearchStream_IsParsing, SearchStream_IsPrinting, SearchStream_Parse, SearchStream_Print, SearchStream_SetAllowed, SearchStream_SetDenied, SearchStream_SetFinishFn | Stream Routines for Parsing and Printing Searches |

Search Structure

A Search has a schema-name and associates a URL with a collection of attribute-value pairs. The schema-name identifies how to interpret the attribute-value pairs. A Search supports text and binary data, and attributes can have multiple values.

Example for Search:


@DOCUMENT { http://www.siroe.com/
    title{17}:   Welcome to Siroe!
    author{13}:  Dot Punchcard
}
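
In the example above, the number in braces matches the length in bytes of the value that follows it. The following is a minimal sketch of how one such line could be formatted. The helper format_av_line is hypothetical, written only for illustration, and is not part of search.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of search.h): writes one attribute line
 * in the "name{size}: value" form into out, which holds outsz bytes.
 * Returns the number of characters written, or -1 if out is too small. */
int format_av_line(char *out, size_t outsz,
                   const char *attribute, const char *value)
{
    assert(out != NULL && attribute != NULL && value != NULL);
    int n = snprintf(out, outsz, "%s{%lu}: %s",
                     attribute, (unsigned long)strlen(value), value);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

Calling format_av_line(line, sizeof line, "author", "Dot Punchcard") would produce the author line shown in the example.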

A Search object has URL and schema-name fields to store its URL and schema_name:

char *url;          /* The URL */
char *schema_name;  /* The Schema-Name, such as @document or @RDMHeader */

A Search object contains a collection of SearchAVPair objects, each of which contains an attribute and one or more values. To access attribute values in a Search, use Search_Find() to retrieve the AVPair for the given attribute, or use Search_Findval() to retrieve the value string for a given attribute. You must use all lowercase for attribute names with Search_Find() and Search_Findval(), since only exact attribute name lookups are supported.

You can create Search objects by using the Search_Create() function. You can also read Search objects from a Search stream.

Search_Create
NSAPI_PUBLIC Search *Search_Create
(char *schema_name, char *url)

Creates a Search structure with the given schema name and URL.

Search_Free
NSAPI_PUBLIC void Search_Free(Search *)

Frees the given Search structure.

Search_GetTotalSize
NSAPI_PUBLIC int Search_GetTotalSize(Search *s)

Gets the estimated total size of the Search in bytes.

Search_GetAttributeCount
NSAPI_PUBLIC int Search_GetAttributeCount(Search *s)

Gets the number of attributes in the Search.

Search_GetAttributeSize
NSAPI_PUBLIC int Search_GetAttributeSize(Search *s)

Gets the size of the attributes only.

Search_GetValueSize
NSAPI_PUBLIC int Search_GetValueSize(Search *s)

Gets the size of the values only.

Search_GetValueCount
NSAPI_PUBLIC int Search_GetValueCount(Search *s)

Gets the number of values only.

Search_Merge
NSAPI_PUBLIC int Search_Merge(Search *dst, Search *src);

Use this function to merge two Search objects (perform a union of their attribute-value pairs). It returns non-zero on error; otherwise, it returns zero and the 'dst' Search object contains all the attribute-value pairs from the 'src' Search object.

If the 'dst' object contains the same attribute as 'src', then the attribute becomes a multi-valued attribute and all of the values are copied over to 'dst'. Only multi-valued attributes are copied over in this way. For single-valued attributes, the value in 'dst' is discarded. Currently only “classification” is a multi-valued attribute.

Search_Find
#define Search_Find(search, attribute-name)

Retrieves the AVPair for the given attribute in the given search. For example, the following statement gets the AVPair for the title attribute in the search s:

SearchAVPair *avp = Search_Find(s, "title");

Search_Findval
#define Search_Findval(search, attribute-name)

Retrieves the value string for the given attribute in the given search. For example, the following statement prints the value of the title attribute of the search s:

printf("Title = %s\n", Search_Findval(s, "title"));

Search_Remove
#define Search_Remove(search, attribute-name)

Removes the given attribute from the given search.

Search_Insert
#define Search_Insert(search, attribute-name, value, value-size)

Inserts the given attribute and the value of the given size as an AVPair into the search.

Search_InsertAVP
#define Search_InsertAVP(search, avpair)

Inserts the given AVPair into the given search.

Search_Apply
#define Search_Apply(search, function, user-data)

Applies the given function with the given argument (user-data) to each AVPair in the given search. For example:


void print_av(Search *s, SearchAVPair *avp, void *unused)
{
    printf("%s = %s\n", avp->attribute, avp->value);
}

/* print every attribute and value in the search s */
Search_Apply(s, print_av, NULL);
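
The user-data argument lets the callback accumulate a result without using global variables. The sketch below models that pattern in standalone C so that it can be compiled without search.h. DemoAVPair, demo_apply, and count_pair are hypothetical stand-ins; the real callback also receives the Search pointer, as in print_av above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; the real SearchAVPair and
 * Search_Apply are declared in search.h. */
typedef struct {
    const char *attribute;
    const char *value;
} DemoAVPair;

typedef void (*DemoApplyFn)(const DemoAVPair *avp, void *user_data);

/* Calls fn once for each of the n pairs, passing user_data through. */
void demo_apply(const DemoAVPair *pairs, size_t n,
                DemoApplyFn fn, void *user_data)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        fn(&pairs[i], user_data);
}

/* Example callback: counts pairs through an int passed as user_data. */
void count_pair(const DemoAVPair *avp, void *user_data)
{
    (void)avp;                 /* the pair itself is not needed here */
    (*(int *)user_data)++;
}
```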

Attribute-Value Pair Routines

Attribute-value pairs contain an attribute and an associated value. The value often is a simple null-terminated string; however, the value can also be binary data. Attribute-value pairs are stored as SearchAVPair structures.

The important fields in a SearchAVPair structure are:

char *attribute;   /* Attribute string; '\0' terminated */
char *value;       /* Primary value; may be '\0' terminated */
size_t vsize;      /* Number of bytes (8 bits) for primary value */
char **values;     /* Multiple values for multivalued attributes */
size_t *vsizes;    /* The sizes for the values */
int nvalues;       /* Number of values associated with attribute */
int last_slot;     /* Last valid slot - array may contain holes */

SearchAVPair_Create

NSAPI_PUBLIC SearchAVPair * SearchAVPair_Create
(char *a, char *v, int vsz);

Creates an AVPair structure with the given attribute a and value v. The value v is a buffer of vsz bytes.

SearchAVPair_Free

NSAPI_PUBLIC void SearchAVPair_Free(SearchAVPair *avp);

Frees the memory used by the given SearchAVPair structure.

Search_Replace

NSAPI_PUBLIC int Search_Replace(Search *s, 
char *att, char *val, int valsz);

Replaces the value of an existing attribute att with a new value val of size valsz in the Search s.

Search_InsertStr

#define Search_InsertStr(search, attribute, value)

Inserts the given attribute with the given value into the search. 

Search_ReplaceStr

#define Search_ReplaceStr(search, attribute, value)

Replaces the existing value of the given attribute in the search with the given value. 

Search_Rename

NSAPI_PUBLIC int Search_Rename(Search *s, 
char *old_attr, char *new_attr);

Renames the given attribute to the given new name. 

Search_AttributeCompare

NSAPI_PUBLIC int Search_AttributeCompare
(const char *a1, const char *a2);

Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. Case (upper and lower) and trailing -s are ignored when comparing attribute names. The following table illustrates the results of comparing some attribute names.

Multi-valued Attribute Routines

A Search attribute can have multiple values. Search supports the convention of using a -NNN suffix to indicate a multi-valued attribute, for example, Title-1, Title-2, Title-3, and so on. The -NNN values do not need to be sequential positive integers.

The Search Engine supports searching on multi-valued attributes such as the classification attribute. In a Search, such an attribute is represented as classification-1, classification-2, and so on. For example:

classification-1{5}: robot
classification-2{5}: siroe
classification-3{11}: web crawler

Search_AttributeCompareMV

NSAPI_PUBLIC int Search_AttributeCompareMV
(const char *a1, const char *a2);

Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. If neither attribute is multi-valued, the comparison is the same as in Search_AttributeCompare(). If one or both of the attributes are multi-valued, the base name of the multi-valued attribute is used for the comparison. The base name of a multi-valued attribute is the portion of the name before the -. For example, the base name of classification-3 is classification.

Search_MVAttributeParse

NSAPI_PUBLIC int Search_MVAttributeParse(char *a)

Returns the multi-valued number of the given attribute, and strips the attribute string of its -NNN indicator; otherwise, returns zero in the case of a normal attribute name. For example, classification-3 returns the number 3.

Search_IsMVAttribute

NSAPI_PUBLIC char *Search_IsMVAttribute(const char *a);

Returns NULL if the given attribute is not a multi-valued attribute; otherwise returns a pointer to where the multi-valued number occurs in the attribute string. For example, for the multi-valued attribute classification-3, it will return the pointer to 3. 
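
The behavior described for Search_IsMVAttribute() and Search_MVAttributeParse() can be sketched in standalone C as follows. This is an illustrative re-implementation under the assumption that the suffix is a final - followed only by decimal digits; demo_mv_suffix and demo_mv_parse are hypothetical names, and the real routines in search.h may treat edge cases differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: returns a pointer to the digits of a trailing
 * "-NNN" suffix, or NULL if the name has no such suffix. */
const char *demo_mv_suffix(const char *a)
{
    assert(a != NULL);
    const char *dash = strrchr(a, '-');
    if (dash == NULL || dash == a || dash[1] == '\0')
        return NULL;
    for (const char *p = dash + 1; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return NULL;       /* something other than digits follows */
    return dash + 1;
}

/* Strips the "-NNN" suffix from a in place and returns NNN. Returns 0
 * and leaves a unchanged when a is a normal attribute name. */
int demo_mv_parse(char *a)
{
    const char *digits = demo_mv_suffix(a);
    if (digits == NULL)
        return 0;
    int n = atoi(digits);
    a[(digits - 1) - a] = '\0';   /* cut the string at the '-' */
    return n;
}
```

Because the parse step modifies its argument, the caller must pass a writable buffer rather than a string literal.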

Search_InsertMV

NSAPI_PUBLIC int Search_InsertMV(Search *s, 
char *a, int slot, char *v, int vsz, int useval)

Inserts a new value v at index slot for the given attribute a (in non-multivalue form). If set, the useval flag tells the function to use the given value buffer rather than creating its own copy.

For example:

Search_InsertMV(s, "classification", 3,
"web crawler", strlen("web crawler"), 0);

Inserts

classification-3{11}: web crawler

Search_ReplaceMV

NSAPI_PUBLIC int Search_ReplaceMV(Search *s, 
char *a, int slot, char *v, int vsz, int useval);

Replaces the value at index slot for the given attribute a with the new value v of size vsz. The arguments have the same meaning as for Search_InsertMV().

Search_DeleteMV

NSAPI_PUBLIC int Search_DeleteMV
(Search *s, char *a, int slot)

Deletes the value at the index slot in the attribute a. For example:

Search_DeleteMV(s, "classification", 3)

Deletes classification-3.

Search_FindvalMV

NSAPI_PUBLIC const char *Search_FindvalMV
(Search *s, const char *a, int slot)

Finds the value at the index slot in the attribute a. For example:

Search_FindvalMV(s, "classification", 3)

Returns web crawler (using the previous example). 

Search_SqueezeMV

NSAPI_PUBLIC void Search_SqueezeMV(Search *s)

Forces a renumbering to ensure that the multi-value indexes are sequentially increasing (for example, 1, 2, 3,...). This function can be used to fill in any holes that might have occurred during Search_InsertMV() invocations. For example, to insert values explicitly for the multivalue attribute author-*:


Search_InsertMV(s, "author", 1, "John", 4, 0);
Search_InsertMV(s, "author", 2, "Kevin", 5, 0);
Search_InsertMV(s, "author", 6, "Darren", 6, 0);
Search_InsertMV(s, "author", 9, "Tommy", 5, 0);
Search_FindvalMV(s, "author", 9); /* == "Tommy" */
Search_SqueezeMV(s);
Search_FindvalMV(s, "author", 9); /* == NULL */
Search_FindvalMV(s, "author", 4); /* == "Tommy" */
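
The renumbering can be pictured as compacting a sparse array of slots. The sketch below is a simplified model, not the library's implementation: it assumes values are held in an array indexed by slot number with NULL marking a hole, and demo_squeeze is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a multi-valued attribute: values[i] holds the
 * value for slot i, and NULL marks a hole. Slot 0 is unused here.
 * Moves values down so that slots 1..k are filled with no holes and
 * returns k, the new last valid slot. */
int demo_squeeze(const char **values, int last_slot)
{
    assert(values != NULL && last_slot >= 0);
    int next = 1;                      /* next free sequential slot */
    for (int i = 1; i <= last_slot; i++) {
        if (values[i] == NULL)
            continue;                  /* skip holes */
        if (i != next) {
            values[next] = values[i];  /* move the value down */
            values[i] = NULL;
        }
        next++;
    }
    return next - 1;
}
```

With slots 1, 2, 6, and 9 filled, as in the author example, the values end up in slots 1 through 4 and slot 9 becomes empty.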

SearchAVPair_IsMV

#define SearchAVPair_IsMV(avp)

Use this to determine if the AVPair has multiple values or not.

SearchAVPair_NthValid

#define SearchAVPair_NthValid(avp,n)

Use this to determine if the Nth value is valid or not. 

SearchAVPair_NthValue

#define SearchAVPair_NthValue(avp,n)   ((avp)->values[n])

Use this to access the Nth value. For example: 


for (i = 0; i <= avp->last_slot; i++)
  if (SearchAVPair_NthValid(avp, i))
    printf("%s = %s\n", avp->attribute,
      SearchAVPair_NthValue(avp, i));

SearchAVPair_NthVsize

#define SearchAVPair_NthVsize(avp,n)   ((avp)->vsizes[n])

Use this to get the size of the Nth value. 

Search_Contains

NSAPI_PUBLIC boolean_t Search_Contains
(Search *s, char *a, char *v, int vsz);

Indicates whether the given attribute contains the given value. It returns B_TRUE if the value matches one or more of the values of the attribute a in the given Search.

Stream Routines for Parsing and Printing Searches

A SearchStream contains one or more Search objects.

The general approach is to use Search streams to create and process streams of many Search objects. Given a Search stream, you can parse it to get the Search objects from it. Use SearchStream_Parse() to get the next Search object in a Search stream, and use SearchStream_IsEOS() to check whether the last object has been parsed.

You can use filtering functions for a Search stream to specify that certain Search attributes are allowed or denied. If an attribute is allowed, you can parse and print that attribute for Search objects in the stream. If it is denied, you cannot parse or print that attribute of Search objects in the stream.

Search streams can be disk or memory based.

When you create a SearchStream, you need to specify if you will be printing or parsing the Search stream, and if you will be using a memory- or disk-based stream. The functions you need to use will depend on what you will be doing with the Search stream.

To create a Search stream into which you will print Searches, use the following functions:

Search_PrintInitFile()

Creates a disk-based stream ready for printing.

Search_PrintInitStr()

Creates a memory-based stream ready for printing.

Search_PrintInitFn()

Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.

To create a Search stream from a file or a string containing Searches, use the following functions:

Search_ParseInitFile()

Creates a disk-based stream ready for parsing. The stream is created from an input containing Search syntax.

Search_ParseInitStr()

Creates a memory-based stream ready for parsing. The stream is created from an input containing Search syntax.

SearchStream objects have a caller-data field, which you can use as you like:

void *caller_data;   /* hook to be used by caller */

Use SearchStream_Parse() to get the Search objects from the Search stream, and use SearchStream_Print() to write Search objects to the Search stream.

When you’ve finished with the stream, close it by using SearchStream_Finish(). Use SearchStream_SetFinishFn() to register a finish_fn function that is called when the stream is finished.

The following example code takes a Search stream in stdin and prints each Search in the stream to stdout. Notice that this code uses Search_ParseInitFile() to create the SearchStream to parse the input file, and uses Search_PrintInitFile() to create the stream to print the Searches to stdout.


SearchStream *searchin = Search_ParseInitFile(stdin);
SearchStream *searchout = Search_PrintInitFile(stdout);
Search *s;
while (!SearchStream_IsEOS(searchin)) {
    if ((s = SearchStream_Parse(searchin)) != NULL) {
        SearchStream_Print(searchout, s);
        Search_Free(s);
    }
}
/* close both streams when done */
SearchStream_Finish(searchin);
SearchStream_Finish(searchout);

Search_PrintInitFile

NSAPI_PUBLIC SearchStream *
Search_PrintInitFile(FILE *file)

Creates a disk-based stream ready for printing. 

Search_PrintInitStr

NSAPI_PUBLIC SearchStream *
Search_PrintInitStr(SearchBuffer *memory)

Creates a memory-based stream ready for printing. 

Search_PrintInitFn

NSAPI_PUBLIC SearchStream *Search_PrintInitFn(
int (*write_fn)(void *data, char *buf, int bufsz), void *data)

Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.

This function allows you to hook up your own routine for printing. 
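
As an illustration of what such a routine might look like, the sketch below defines a callback with the same parameter list as write_fn that collects output in a fixed-size memory sink. DemoSink and demo_write_fn are hypothetical names, and the return convention (0 on success, -1 on error) is an assumption, since this chapter does not specify what write_fn must return.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sink that the caller would pass as the data argument. */
typedef struct {
    char text[256];
    int  used;          /* bytes of text currently stored */
} DemoSink;

/* A callback with the same shape as write_fn: appends bufsz bytes from
 * buf to the sink and keeps the stored text '\0' terminated.
 * Assumed convention: returns 0 on success, -1 if the sink is full. */
int demo_write_fn(void *data, char *buf, int bufsz)
{
    DemoSink *sink = data;
    assert(sink != NULL && buf != NULL && bufsz >= 0);
    if (sink->used + bufsz >= (int)sizeof sink->text)
        return -1;                       /* not enough room left */
    memcpy(sink->text + sink->used, buf, (size_t)bufsz);
    sink->used += bufsz;
    sink->text[sink->used] = '\0';
    return 0;
}
```

A routine like this would be passed together with a pointer to the sink, in the position of the write_fn and data arguments.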

Search_ParseInitFile

NSAPI_PUBLIC SearchStream
*Search_ParseInitFile(FILE *fp)

Creates a disk-based stream ready for parsing. The file must contain Search-formatted data. The function reads Search data from the file object fp.

Search_ParseInitStr

NSAPI_PUBLIC SearchStream *
Search_ParseInitStr(char *buf, int bufsz)

Creates a memory-based stream ready for parsing. The character buffer must contain Search-formatted data. 

SearchStream_Finish

NSAPI_PUBLIC int SearchStream_Finish
(SearchStream *)

Closes the stream when you have finished with it. 

SearchStream_SetFinishFn

NSAPI_PUBLIC int SearchStream_SetFinishFn
(SearchStream *, int (*finish_fn)(SearchStream *))

Allows you to hook up a function for cleaning up after the Search stream finishes its business. The finish_fn will be called when SearchStream_Finish() has finished executing.

SearchStream_Print

#define SearchStream_Print(ss, s)

Prints another Search object to the Search stream ss. Returns 0 on success, or non-zero on error.

SearchStream_Parse

#define SearchStream_Parse(ss)

Parses and returns the next Search object in the Search stream. 

SearchStream_IsEOS

#define SearchStream_IsEOS(s)

Returns 1 (true) if the Search stream has been exhausted. 

SearchStream_IsPrinting

#define SearchStream_IsPrinting(s)

Returns 1 (true) if the Search stream was set up for printing by Search_PrintInitFile() or Search_PrintInitStr().

SearchStream_IsParsing

#define SearchStream_IsParsing(s)

Returns 1 (true) if the Search stream was set up for parsing by Search_ParseInitFile() or Search_ParseInitStr().

Filtering Search Objects

To support targeted parsing and printing, you can use the attribute filtering mechanisms in the Search stream. For each Search stream object, you can associate a list of allowed attributes. When printing a Search stream, only the attributes that match the allowed attributes will be printed. When parsing a Search stream, only the attributes that match the allowed attributes will be parsed.

Use SearchStream_SetAllowed() to allow attributes and SearchStream_SetDenied() to deny them. SearchStream_IsAllowed() and SearchStream_IsDenied() report whether a given attribute is allowed or denied. You can allow or deny an attribute, but not both.
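
The allow-list check can be sketched as a simple lookup. The example below is a standalone model, not the library's code: it assumes the attribute list is a NULL-terminated array (the SearchStream_SetAllowed() signature takes no count, but the source does not state how the array ends) and that names are matched exactly. demo_is_allowed is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter check. allowed is assumed to be a NULL-terminated
 * array of attribute names. Returns 1 if attribute appears in the list
 * and 0 otherwise. */
int demo_is_allowed(const char *const allowed[], const char *attribute)
{
    assert(allowed != NULL && attribute != NULL);
    for (size_t i = 0; allowed[i] != NULL; i++)
        if (strcmp(allowed[i], attribute) == 0)
            return 1;
    return 0;
}
```

A printing or parsing loop would consult such a check for each attribute and skip the ones that are not allowed.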

SearchStream_IsAllowed

NSAPI_PUBLIC boolean_t SearchStream_IsAllowed
(SearchStream *ss, char *attribute);

Indicates whether the given attribute is allowed (that is, whether it can be printed or parsed).

SearchStream_SetAllowed

NSAPI_PUBLIC int SearchStream_SetAllowed
(SearchStream *ss, char *allowed_attrs[])

Sets all the attributes in the allowed_attrs array to allowed.

SearchStream_SetDenied

NSAPI_PUBLIC int SearchStream_SetDenied
(SearchStream *ss, char *denied_attrs[]);

Sets all the attributes in the denied_attrs array to be denied (that is, they cannot be parsed or printed).

SearchStream_GetAllowed

NSAPI_PUBLIC char **SearchStream_GetAllowed
(SearchStream *ss)

Returns an array of all the attributes that are allowed. 

SearchStream_GetDenied

NSAPI_PUBLIC char **SearchStream_GetDenied
(SearchStream *ss);

Returns an array of all the attributes that are denied. 

Memory Buffer Management

You can use Search buffers in parsing or printing routines. They take care of memory allocation for inserting and appending. They are basically memory blocks that are easy for Search routines to use.

A Search buffer is represented by a SearchBuffer structure, which is created with the SearchBuffer_Create() function and freed with the SearchBuffer_Free() function. The SearchBuffer structure provides the append(), increase(), and reset() functions for manipulating the data in the buffer.

SearchBuffer_Create

NSAPI_PUBLIC SearchBuffer *
SearchBuffer_Create(int default_sz);

Creates a SearchBuffer. The SearchBuffer is used in Search_PrintInitStr(SearchBuffer *memory). Before you can print Searches to memory, you need to create a buffer for the output.

SearchBuffer_Free

NSAPI_PUBLIC void SearchBuffer_Free
(SearchBuffer *sb);

Releases the memory buffer created by SearchBuffer_Create().

append

void (*append)(SearchBuffer *sb, char *data, int n)

Copies n bytes of data into the buffer.

increase

void (*increase)(SearchBuffer *sb, int add_n)

Increases the size of the data buffer by add_n bytes.

reset

void (*reset)(SearchBuffer *sb)

Resets the size of the data buffer and invalidates all currently valid data. A buffer can be reused by resetting it this way.
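
To make the three operations concrete, here is a minimal sketch of a growable buffer with append, increase, and reset operations. It is a standalone model under stated assumptions: DemoBuffer and the demo_ functions are hypothetical, the real SearchBuffer layout is not shown in this chapter, and the doubling growth policy and int return codes are choices made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer; not the real SearchBuffer layout. */
typedef struct {
    char  *data;
    size_t size;       /* bytes currently valid */
    size_t capacity;   /* bytes allocated */
} DemoBuffer;

/* Makes room for add_n more bytes. Returns 0 on success, -1 on failure. */
int demo_increase(DemoBuffer *b, size_t add_n)
{
    assert(b != NULL);
    if (b->size + add_n <= b->capacity)
        return 0;
    size_t new_cap = b->capacity ? b->capacity : 16;
    while (new_cap < b->size + add_n)
        new_cap *= 2;                  /* grow geometrically */
    char *p = realloc(b->data, new_cap);
    if (p == NULL)
        return -1;
    b->data = p;
    b->capacity = new_cap;
    return 0;
}

/* Copies n bytes of data to the end of the buffer. */
int demo_append(DemoBuffer *b, const char *data, size_t n)
{
    if (demo_increase(b, n) != 0)
        return -1;
    if (n > 0)
        memcpy(b->data + b->size, data, n);
    b->size += n;
    return 0;
}

/* Invalidates the current contents but keeps the allocation for reuse. */
void demo_reset(DemoBuffer *b)
{
    assert(b != NULL);
    b->size = 0;
}
```

A buffer initialized as { NULL, 0, 0 } can be appended to immediately, and the caller releases it with free(b.data) when done.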