Sun Java System Portal Server 7 Developer's Guide

Chapter 21 SOIF API

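This chapter describes the functions and objects defined in the soif.h header file. It contains the following sections: Functions and Objects, SOIF Structure, Attribute-Value Pair Routines, Multi-valued Attribute Routines, Stream Routines for Parsing and Printing SOIFs, and Memory Buffer Management.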

Functions and Objects

Table 21–1 provides an alphabetized version of the functions and objects for your reference.

Table 21–1 Alphabetized Functions and Objects Defined in the soif.h File

SOIF function or object | Category
append, increase, reset, SOIFBuffer_Create, SOIFBuffer_Free | Memory Buffer Management
SOIF_Apply, SOIF_Create, SOIF_Find, SOIF_Findval, SOIF_Free, SOIF_GetAttributeCount, SOIF_GetAttributeSize, SOIF_GetTotalSize, SOIF_GetValueCount, SOIF_GetValueSize, SOIF_InsertAVP, SOIF_Merge, SOIF_Remove | SOIF Structure
SOIF_AttributeCompare, SOIF_InsertStr, SOIF_Rename, SOIF_Replace, SOIF_ReplaceMV, SOIF_ReplaceStr, SOIF_SqueezeMV, SOIFAVPair_Create, SOIFAVPair_Free | Attribute-Value Pair Routines
SOIF_AttributeCompareMV, SOIF_Contains, SOIF_DeleteMV, SOIF_FindvalMV, SOIF_Insert, SOIF_InsertMV, SOIF_IsMVAttribute, SOIF_MVAttributeParse, SOIFAVPair_IsMV, SOIFAVPair_NthValid, SOIFAVPair_NthValue, SOIFAVPair_NthVsize | Multi-valued Attribute Routines
SOIF_ParseInitFile, SOIF_ParseInitStr, SOIF_PrintInitFile, SOIF_PrintInitFn, SOIF_PrintInitStr, SOIFStream_Finish, SOIFStream_GetAllowed, SOIFStream_GetDenied, SOIFStream_IsAllowed, SOIFStream_IsEOS, SOIFStream_IsParsing, SOIFStream_IsPrinting, SOIFStream_Parse, SOIFStream_Print, SOIFStream_SetAllowed, SOIFStream_SetDenied, SOIFStream_SetFinishFn | Stream Routines for Parsing and Printing SOIFs

SOIF Structure

A SOIF has a schema-name and associates a URL with a collection of attribute-value pairs. The schema-name identifies how to interpret the attribute-value pairs. SOIF supports text and binary data, and attributes can have multiple values.

An example SOIF is the following:


@DOCUMENT { http://www.siroe.com/
    title{17}:   Welcome to Siroe!
    author{13}:  Dot Punchcard
}
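
The number in braces after each attribute name in the example above is the size of the value in bytes. The following is a minimal sketch of how such a line could be produced for a text value. The helper name format_avp_line is hypothetical and is not part of soif.h, and the tab separator after the colon is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of soif.h: writes "attribute{N}:\tvalue\n"
 * into out, where N is the byte length of the text value.
 * Returns the number of characters written, or -1 if out is too small. */
int format_avp_line(char *out, size_t outsz, const char *attribute, const char *value)
{
    assert(out != NULL && attribute != NULL && value != NULL);

    size_t n = strlen(value);   /* for a text value, the size is its length in bytes */

    /* Assumption: a tab separates the colon from the value. */
    int written = snprintf(out, outsz, "%s{%zu}:\t%s\n", attribute, n, value);
    if (written < 0 || (size_t)written >= outsz)
        return -1;              /* output was truncated or an encoding error occurred */
    return written;
}
```

For the title attribute in the example, the helper would compute a size of 17, which matches the {17} shown above.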

A SOIF object has url and schema_name fields that store its URL and schema name:

char *url;            /* The URL */
char *schema_name;    /* The Schema-Name, such as @document or @RDMHeader */

A SOIF object contains a collection of SOIFAVPair objects, each of which contains an attribute and one or more values. To access attribute values in a SOIF, use SOIF_Find() to retrieve the AVPair for the given attribute, or use SOIF_Findval() to retrieve the value string for a given attribute. You must use all lowercase for attribute names with SOIF_Find() and SOIF_Findval(), since only exact attribute name lookups are supported.

You can create SOIF objects by using the SOIF_Create() function. You can also read SOIF objects from a SOIF stream.

SOIF_Create
NSAPI_PUBLIC SOIF *SOIF_Create(char *schema_name, char *url)

Creates a SOIF structure with the given schema name and URL.

SOIF_Free
NSAPI_PUBLIC void SOIF_Free(SOIF *)

Frees the given SOIF structure.

SOIF_GetTotalSize
NSAPI_PUBLIC int SOIF_GetTotalSize(SOIF *s)

Gets the estimated total size of the SOIF in bytes.

SOIF_GetAttributeCount
NSAPI_PUBLIC int SOIF_GetAttributeCount(SOIF *s)

Gets the number of attributes in the SOIF.

SOIF_GetAttributeSize
NSAPI_PUBLIC int SOIF_GetAttributeSize(SOIF *s)

Gets the size of the attributes only.

SOIF_GetValueSize
NSAPI_PUBLIC int SOIF_GetValueSize(SOIF *s)

Gets the size of the values only.

SOIF_GetValueCount
NSAPI_PUBLIC int SOIF_GetValueCount(SOIF *s)

Gets the number of values only.

SOIF_Merge
NSAPI_PUBLIC int SOIF_Merge(SOIF *dst, SOIF *src);

Use this function to merge two SOIF objects (perform a union of their attribute-values). It returns non-zero on error; otherwise, it returns zero and the 'dst' SOIF object contains all the attribute-value pairs from the 'src' SOIF object.

If the 'dst' object contains the same attribute as 'src', then the attribute becomes a multi-valued attribute and all of the values are copied over to 'dst'. Only multi-valued attributes are copied over. For single-value attributes, discard the value in 'dst'. Currently only "classification" is a multi-valued attribute.

SOIF_Find
#define SOIF_Find(soif, attribute-name)

Retrieves the AVPair for the given attribute in the given soif. For example, the following statement gets the AVPair for the title attribute in the soif s:

SOIFAVPair *avp = SOIF_Find(s, "title");

SOIF_Findval
#define SOIF_Findval(soif, attribute-name)

Retrieves the value string for the given attribute in the given soif. For example, the following statement prints the value of the title attribute of the soif s:

printf("Title = %s\n", SOIF_Findval(s, "title"));

SOIF_Remove
#define SOIF_Remove(soif, attribute-name)

Removes the given attribute from the given soif.

SOIF_Insert
#define SOIF_Insert(soif, attribute-name, value, value-size)

Inserts the given attribute and the value of the given size as an AVPair into the soif.

SOIF_InsertAVP
#define SOIF_InsertAVP(soif, avpair)

Inserts the given AVPair into the given soif.

SOIF_Apply
#define SOIF_Apply(soif, function, user-data)

Applies the given function with the given argument (user-data) to each AVPair in the given soif. For example:


void print_av(SOIF *s, SOIFAVPair *avp, void *unused)
{
    printf("%s = %s\n", avp->attribute, avp->value);
}

/* print every attribute and value in the soif s */
SOIF_Apply(s, print_av, NULL);

Attribute-Value Pair Routines

Attribute-value pairs contain an attribute and an associated value. The value often is a simple null-terminated string; however, the value can also be binary data. Attribute-value pairs are stored as SOIFAVPair structures.

The important fields in a SOIFAVPair structure are the following:

char *attribute;    /* Attribute string; '\0' terminated */
char *value;        /* Primary value; may be '\0' terminated */
size_t vsize;       /* Number of bytes (8 bits) for primary value */
char **values;      /* Multiple values for multivalued attributes */
size_t *vsizes;     /* The sizes for the values */
int nvalues;        /* Number of values associated with attribute */
int last_slot;      /* Last valid slot - array may contain holes */

SOIFAVPair_Create
NSAPI_PUBLIC SOIFAVPair * SOIFAVPair_Create(char *a, char *v, int vsz);

Creates an AVPair structure with the given attribute a and value v. The value v is a buffer of vsz bytes.

SOIFAVPair_Free
NSAPI_PUBLIC void SOIFAVPair_Free(SOIFAVPair *avp);

Frees the memory used by the given SOIFAVPair structure.

SOIF_Replace
NSAPI_PUBLIC int SOIF_Replace(SOIF *s, char *att, char *val, int valsz);

Replaces the value of an existing attribute att with a new value val of size valsz in the SOIF s.

SOIF_InsertStr
#define SOIF_InsertStr(soif, attribute, value)

Inserts the given attribute with the given value into the soif.

SOIF_ReplaceStr
#define SOIF_ReplaceStr(soif, attribute, value)

Replaces the existing value of the given attribute in the soif with the given value.

SOIF_Rename
NSAPI_PUBLIC int SOIF_Rename(SOIF *s, char *old_attr, char *new_attr);

Renames the given attribute to the given new name.

SOIF_AttributeCompare
NSAPI_PUBLIC int SOIF_AttributeCompare(const char *a1, const char *a2);

Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. Case (upper and lower) and trailing -s are ignored when comparing attribute names. The following table illustrates the results of comparing some attribute names.

AttributeA | AttributeB | Does SOIF_AttributeCompare() consider them to be the same?
title | Title | yes
title | Title | yes
title | title | yes
title | title-page | no
title | title | no
author | title | no
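
The comparison rule described above can be sketched as follows. This is an illustration only, not the soif.h implementation: the function name attr_compare_sketch is hypothetical, and the sketch assumes that "trailing -s" means trailing hyphen characters.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of name with any trailing '-' characters dropped. */
static size_t trimmed_len(const char *name)
{
    size_t n = strlen(name);
    while (n > 0 && name[n - 1] == '-')
        n--;
    return n;
}

/* Hypothetical stand-in for the comparison rule (not the soif.h code).
 * Returns 0 when the names match ignoring case and trailing hyphens,
 * and non-zero otherwise. */
int attr_compare_sketch(const char *a1, const char *a2)
{
    assert(a1 != NULL && a2 != NULL);

    size_t n1 = trimmed_len(a1);
    size_t n2 = trimmed_len(a2);
    if (n1 != n2)
        return 1;

    for (size_t i = 0; i < n1; i++) {
        if (tolower((unsigned char)a1[i]) != tolower((unsigned char)a2[i]))
            return 1;
    }
    return 0;
}
```

Under these assumptions, title and Title compare as equal, while title and title-page do not.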

Multi-valued Attribute Routines

A SOIF attribute can have multiple values. SOIF supports the convention of using -NNN to indicate a multivalued attribute, for example, Title-1, Title-2, Title-3, and so on. The -NNN suffixes do not need to be sequential positive integers.

The Search Engine supports searching on multi-valued attributes such as the classification attribute. In SOIF representation, it is represented using classification-1, classification-2, and so on. For example:

classification-1{5}: robot
classification-2{5}: siroe
classification-3{11}: web crawler

SOIF_AttributeCompareMV
NSAPI_PUBLIC int SOIF_AttributeCompareMV(const char *a1, const char *a2);

Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. If neither of the attributes is multi-valued, the comparison is the same as that of SOIF_AttributeCompare(). If one or both of the attributes are multi-valued, the base name of the multi-valued attribute is used for the comparison. The base name of a multi-valued attribute is the portion of the name before the -NNN suffix. For example, the base name of classification-3 is classification.
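
As an illustration of comparing on base names, the following sketch strips a trailing -NNN suffix before comparing. It is not the soif.h implementation. The names base_len and attr_compare_mv_sketch are hypothetical, and the sketch assumes that a suffix consists of a hyphen followed by one or more decimal digits and that case is ignored, as it is for SOIF_AttributeCompare().

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of the base name: the part before a trailing "-NNN" suffix,
 * or the whole name when no such suffix is present. */
static size_t base_len(const char *name)
{
    size_t n = strlen(name);
    size_t i = n;

    while (i > 0 && isdigit((unsigned char)name[i - 1]))
        i--;

    /* A suffix needs at least one digit and a '-' right before the digits. */
    if (i < n && i > 0 && name[i - 1] == '-')
        return i - 1;
    return n;
}

/* Hypothetical stand-in (not the soif.h code): returns 0 when the base
 * names match ignoring case, and non-zero otherwise. */
int attr_compare_mv_sketch(const char *a1, const char *a2)
{
    assert(a1 != NULL && a2 != NULL);

    size_t n1 = base_len(a1);
    size_t n2 = base_len(a2);
    if (n1 != n2)
        return 1;

    for (size_t i = 0; i < n1; i++) {
        if (tolower((unsigned char)a1[i]) != tolower((unsigned char)a2[i]))
            return 1;
    }
    return 0;
}
```

With this rule, classification-3 and classification-1 compare as equal because both reduce to classification, while title-page keeps its full name because page is not a number.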

SOIF_MVAttributeParse
NSAPI_PUBLIC int SOIF_MVAttributeParse(char *a)

Returns the multi-valued number of the given attribute, and strips the attribute string of its -NNN indicator; otherwise, returns zero in the case of a normal attribute name. For example, classification-3 returns the number 3.

SOIF_IsMVAttribute
NSAPI_PUBLIC char *SOIF_IsMVAttribute(const char *a);

Returns NULL if the given attribute is not a multi-valued attribute; otherwise returns a pointer to where the multi-valued number occurs in the attribute string. For example, for the multi-valued attribute classification-3, it will return the pointer to 3.
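
The behavior described for SOIF_IsMVAttribute() and SOIF_MVAttributeParse() can be sketched with two small helpers. These are illustrations, not the soif.h code: the names mv_number_sketch and mv_parse_sketch are hypothetical, and the sketch assumes that the -NNN indicator is a hyphen followed by one or more decimal digits at the end of the name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pointer to the digits of a trailing "-NNN" indicator,
 * or NULL when the name has no such indicator. */
const char *mv_number_sketch(const char *a)
{
    assert(a != NULL);

    size_t n = strlen(a);
    size_t i = n;
    while (i > 0 && isdigit((unsigned char)a[i - 1]))
        i--;

    if (i == n || i == 0 || a[i - 1] != '-')
        return NULL;            /* no digits, or no '-' before them */
    return a + i;
}

/* Returns the multi-valued number and truncates a at the '-', or
 * returns 0 and leaves a unchanged for a normal attribute name. */
int mv_parse_sketch(char *a)
{
    assert(a != NULL);

    const char *digits = mv_number_sketch(a);
    if (digits == NULL)
        return 0;

    int slot = atoi(digits);
    a[(digits - a) - 1] = '\0'; /* strip the "-NNN" indicator in place */
    return slot;
}
```

Because the second helper modifies its argument, pass it a writable buffer rather than a string literal.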

SOIF_InsertMV
NSAPI_PUBLIC int SOIF_InsertMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval)

Inserts a new value v at index slot for the given attribute a (in non-multivalue form). If set, the useval flag tells the function to use the given value buffer rather than creating its own copy.

For example:

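SOIF_InsertMV(s, "classification", 3, "web crawler", strlen("web crawler"), 0);

This call inserts the following:

classification-3{11}: web crawler

SOIF_ReplaceMV
NSAPI_PUBLIC int SOIF_ReplaceMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval);

Replaces the value at index slot for the given attribute a with the new value v of size vsz. The useval flag has the same meaning as in SOIF_InsertMV().

SOIF_DeleteMV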
NSAPI_PUBLIC int SOIF_DeleteMV(SOIF *s, char *a, int slot)

Deletes the value at the index slot in the attribute a. For example:

SOIF_DeleteMV(s, "classification", 3)

Deletes classification-3.

SOIF_FindvalMV
NSAPI_PUBLIC const char *SOIF_FindvalMV(SOIF *s, const char *a, int slot)

Finds the value at the index slot in the attribute a. For example:

SOIF_FindvalMV(s, "classification", 3)

Returns web crawler (using the previous example).

SOIF_SqueezeMV
NSAPI_PUBLIC void SOIF_SqueezeMV(SOIF *s)

Forces a renumbering to ensure that the multi-value indexes are sequentially increasing (for example, 1, 2, 3,...). This function can be used to fill in any holes that might have occurred during SOIF_InsertMV() invocations. For example, to insert values explicitly for the multivalue attribute author-*:


SOIF_InsertMV(s, "author", 1, "John", 4, 0);
SOIF_InsertMV(s, "author", 2, "Kevin", 5, 0);
SOIF_InsertMV(s, "author", 6, "Darren", 6, 0);
SOIF_InsertMV(s, "author", 9, "Tommy", 5, 0);
SOIF_FindvalMV(s, "author", 9); /* == "Tommy" */
SOIF_SqueezeMV(s);
SOIF_FindvalMV(s, "author", 9); /* == NULL */
SOIF_FindvalMV(s, "author", 4); /* == "Tommy" */
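
The renumbering in the example above can be pictured with a small stand-in for the slot array. The following sketch is not the soif.h implementation. The MVSketch type and the mv_insert, mv_findval, and mv_squeeze functions are hypothetical, and the fixed-size array with slot 0 unused is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16

/* Hypothetical stand-in for one multi-valued attribute. slots[i] is the
 * value at index i, or NULL for a hole. Slot 0 is unused so that the
 * indexes match the 1-based -NNN suffixes. */
typedef struct {
    const char *slots[MAX_SLOTS];
    int last_slot;              /* highest index that holds a value */
} MVSketch;

void mv_insert(MVSketch *mv, int slot, const char *value)
{
    assert(mv != NULL && slot >= 1 && slot < MAX_SLOTS);
    mv->slots[slot] = value;
    if (slot > mv->last_slot)
        mv->last_slot = slot;
}

const char *mv_findval(const MVSketch *mv, int slot)
{
    assert(mv != NULL);
    if (slot < 1 || slot >= MAX_SLOTS)
        return NULL;
    return mv->slots[slot];
}

/* Moves the values down so that they occupy indexes 1, 2, 3, ...
 * in their original order, leaving no holes. */
void mv_squeeze(MVSketch *mv)
{
    assert(mv != NULL);

    int next = 1;               /* next index to fill */
    for (int i = 1; i <= mv->last_slot; i++) {
        if (mv->slots[i] == NULL)
            continue;           /* skip holes */
        const char *v = mv->slots[i];
        mv->slots[i] = NULL;
        mv->slots[next++] = v;
    }
    mv->last_slot = next - 1;
}
```

Inserting values at indexes 1, 2, 6, and 9 and then squeezing leaves them at indexes 1 through 4, which mirrors the author example.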

SOIFAVPair_IsMV
#define SOIFAVPair_IsMV(avp)

Use this to determine if the AVPair has multiple values or not.

SOIFAVPair_NthValid
#define SOIFAVPair_NthValid(avp,n)

Use this to determine if the Nth value is valid or not.

SOIFAVPair_NthValue
#define SOIFAVPair_NthValue(avp,n)   ((avp)->values[n])

Use this to access the Nth value. For example:


for (i = 0; i <= avp->last_slot; i++)
  if (SOIFAVPair_NthValid(avp, i))
    printf("%s = %s\n", avp->attribute,
      SOIFAVPair_NthValue(avp, i));

SOIFAVPair_NthVsize
#define SOIFAVPair_NthVsize(avp,n)   ((avp)->vsizes[n])

Use this to get the size of the Nth value.

SOIF_Contains
NSAPI_PUBLIC boolean_t SOIF_Contains(SOIF *s, char *a, char *v, int vsz);

Indicates if the given attribute contains the given value. It returns B_TRUE if the value matches one or more of the values of the attribute a in the given SOIF s.

Stream Routines for Parsing and Printing SOIFs

A SOIFStream contains one or more SOIF objects.

The general approach is that you use SOIF streams to create and process streams of many SOIF objects. Given a SOIF stream, you can parse it to get the SOIF objects from it. Use SOIFStream_Parse() to get the next SOIF object in a SOIF stream. You can use SOIFStream_IsEOS() to check whether the last object has been parsed.

You can use filtering functions for a SOIF stream to specify that certain SOIF attributes are allowed or denied. If an attribute is allowed, you can parse and print that attribute for SOIF objects in the stream. If it is denied, you cannot parse or print that attribute of SOIF objects in the stream.

SOIF streams can be disk or memory based.

When you create a SOIFStream, you need to specify if you will be printing or parsing the SOIF stream, and if you will be using a memory- or disk-based stream. The functions you need to use will depend on what you will be doing with the SOIF stream.

To create a SOIF stream into which you will print SOIFs, use the following functions:

SOIF_PrintInitFile()

Creates a disk-based stream ready for printing.

SOIF_PrintInitStr()

Creates a memory-based stream ready for printing.

SOIF_PrintInitFn()

Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.

To create a SOIF stream from a file or a string containing SOIF, use the following functions:

SOIF_ParseInitFile()

Creates a disk-based stream ready for parsing. The stream is created from an input containing SOIF syntax.

SOIF_ParseInitStr()

Creates a memory-based stream ready for parsing. The stream is created from an input containing SOIF syntax.

SOIFStream objects have a caller-data field, which you can use as you like:

void *caller_data;   /* hook to be used by caller */

Use SOIFStream_Parse() to get the SOIF objects from the SOIF stream, and use SOIFStream_Print() to write SOIF objects to the SOIF stream.

When you have finished with the stream, close it by using SOIFStream_Finish(). Use SOIFStream_SetFinishFn() to register a finish_fn function that is called when SOIFStream_Finish() has finished executing.

The following example code takes a SOIF stream in stdin and prints each SOIF in the stream to stdout. Notice that this code uses SOIF_ParseInitFile() to create the SOIFStream to parse the input file, and uses SOIF_PrintInitFile() to create the stream to print the SOIFs to stdout.


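SOIFStream *soifin = SOIF_ParseInitFile(stdin);
SOIFStream *soifout = SOIF_PrintInitFile(stdout);
SOIF *s;

/* Copy each SOIF from the input stream to the output stream. */
while (!SOIFStream_IsEOS(soifin)) {
    if ((s = SOIFStream_Parse(soifin)) != NULL) {
        SOIFStream_Print(soifout, s);
        SOIF_Free(s);
    }
}

/* Close both streams when done. */
SOIFStream_Finish(soifin);
SOIFStream_Finish(soifout);

SOIF_PrintInitFile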
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFile(FILE *file)

Creates a disk-based stream ready for printing.

SOIF_PrintInitStr
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitStr(SOIFBuffer *memory)

Creates a memory-based stream ready for printing.

SOIF_PrintInitFn
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFn(int (*write_fn)(void *data,char *buf, int bufsz), void *data)

Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.

This function allows you to hook up your own routine for printing.
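
A routine that you hook up this way must match the write_fn parameter type shown in the signature. The following sketch shows one possible routine that collects the printed text in a fixed-size buffer. The CaptureSketch type and the capture_write name are hypothetical, and the return convention (0 on success, -1 when the buffer is full) is an assumption, because the return value expected from write_fn is not described here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical destination for printed SOIF text: a fixed-size buffer. */
typedef struct {
    char text[256];
    int  used;                  /* bytes stored so far, excluding the '\0' */
} CaptureSketch;

/* Matches the write_fn parameter type of SOIF_PrintInitFn():
 *     int (*write_fn)(void *data, char *buf, int bufsz)
 * Assumption: returns 0 on success and -1 when the data does not fit. */
int capture_write(void *data, char *buf, int bufsz)
{
    CaptureSketch *cap = data;
    assert(cap != NULL && buf != NULL && bufsz >= 0);

    /* Leave room for the terminating '\0'. */
    if (bufsz > (int)sizeof cap->text - 1 - cap->used)
        return -1;

    memcpy(cap->text + cap->used, buf, (size_t)bufsz);
    cap->used += bufsz;
    cap->text[cap->used] = '\0';
    return 0;
}
```

A pointer to a CaptureSketch would be passed as the data argument, so that each call to the routine receives it as its first parameter.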

SOIF_ParseInitFile
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitFile(FILE *fp)

Creates a disk-based stream ready for parsing. The file must contain SOIF-formatted data. The function reads SOIF data from the file object fp.

SOIF_ParseInitStr
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitStr(char *buf, int bufsz)

Creates a memory-based stream ready for parsing. The character buffer must contain SOIF-formatted data.
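
To show what SOIF-formatted data in a character buffer looks like to a parser, the following sketch reads one attribute{N}: value entry and uses the byte count N to locate the end of the value. It is not the soif.h parser. The function name parse_avp_sketch is hypothetical, and the sketch assumes that blanks or tabs separate the colon from the value.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of soif.h. Parses one "attribute{N}: value"
 * entry from buf. On success it copies the attribute name into attr, points
 * *value at the N value bytes inside buf, stores N in *vsize, and returns a
 * pointer just past the value. Returns NULL on malformed input. */
const char *parse_avp_sketch(const char *buf, size_t bufsz,
                             char *attr, size_t attrsz,
                             const char **value, size_t *vsize)
{
    assert(buf != NULL && attr != NULL && attrsz > 0);
    assert(value != NULL && vsize != NULL);

    const char *p = buf;
    const char *end = buf + bufsz;

    while (p < end && isspace((unsigned char)*p))   /* skip indentation */
        p++;

    /* The attribute name runs up to '{'. */
    const char *name = p;
    while (p < end && *p != '{')
        p++;
    size_t namelen = (size_t)(p - name);
    if (p == end || namelen == 0 || namelen >= attrsz)
        return NULL;
    memcpy(attr, name, namelen);
    attr[namelen] = '\0';
    p++;                                            /* skip '{' */

    /* The decimal byte count runs up to "}:". */
    if (p == end || !isdigit((unsigned char)*p))
        return NULL;
    size_t n = 0;
    while (p < end && isdigit((unsigned char)*p)) {
        n = n * 10 + (size_t)(*p++ - '0');
        if (n > bufsz)
            return NULL;                            /* count cannot fit in buf */
    }
    if (end - p < 2 || p[0] != '}' || p[1] != ':')
        return NULL;
    p += 2;

    /* Assumption: blanks or tabs separate the colon from the value. */
    while (p < end && (*p == ' ' || *p == '\t'))
        p++;

    if ((size_t)(end - p) < n)
        return NULL;                                /* value is truncated */
    *value = p;
    *vsize = n;
    return p + n;
}
```

Because the value is taken as exactly N bytes rather than read up to a terminator, the same approach can carry binary data.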

SOIFStream_Finish
NSAPI_PUBLIC int SOIFStream_Finish(SOIFStream *)

Closes the stream when you have finished with it.

SOIFStream_SetFinishFn
NSAPI_PUBLIC int SOIFStream_SetFinishFn(SOIFStream *, int (*finish_fn)(SOIFStream *))

Allows you to hook up a function for cleaning up after the SOIF stream finishes its business. The finish_fn will be called when SOIFStream_Finish() has finished executing.

SOIFStream_Print
#define SOIFStream_Print(ss, s)

Prints another SOIF object to the SOIF stream ss. Returns 0 on success, or non-zero on error.

SOIFStream_Parse
#define SOIFStream_Parse(ss)

Parses and returns the next SOIF object in the SOIF stream.

SOIFStream_IsEOS
#define SOIFStream_IsEOS(s)

Returns 1 (true) if the SOIF stream has been exhausted.

SOIFStream_IsPrinting
#define SOIFStream_IsPrinting(s)

Returns 1 (true) if the SOIF stream was set up for printing by SOIF_PrintInitFile() or SOIF_PrintInitStr().

SOIFStream_IsParsing
#define SOIFStream_IsParsing(s)

Returns 1 (true) if the SOIF stream was set up for parsing by SOIF_ParseInitFile() or SOIF_ParseInitStr().

Filtering SOIF Objects

To support targeted parsing and printing, you can use the attribute filtering mechanisms in the SOIF stream. For each SOIF stream object, you can associate a list of allowed attributes. When printing a SOIF stream, only the attributes that match the allowed attributes will be printed. When parsing a SOIF stream, only the attributes that match the allowed attributes will be parsed.

SOIFStream_SetAllowed() marks attributes as allowed and SOIFStream_SetDenied() marks them as denied, while SOIFStream_IsAllowed() and SOIFStream_IsDenied() test the status of an attribute. You can allow or deny an attribute, but not both.
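
The effect of an allowed list can be sketched as a simple filter over attribute names. The following is an illustration only, not the soif.h code. The function names are hypothetical, and the sketch assumes that the list is a NULL-terminated array and that names are matched exactly. The attribute names md5 and description in the usage note are also made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the allowed check (not the soif.h code).
 * Assumption: allowed is a NULL-terminated array of attribute names
 * and names are matched exactly. Returns 1 if allowed, 0 otherwise. */
int is_allowed_sketch(const char *const allowed[], const char *attribute)
{
    assert(allowed != NULL && attribute != NULL);

    for (size_t i = 0; allowed[i] != NULL; i++) {
        if (strcmp(allowed[i], attribute) == 0)
            return 1;
    }
    return 0;
}

/* Copies the allowed names from attrs[0..n) into out, preserving their
 * order, and returns how many were kept. out must have room for n entries. */
size_t filter_attrs_sketch(const char *const allowed[],
                           const char *const attrs[], size_t n,
                           const char *out[])
{
    assert(attrs != NULL && out != NULL);

    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_allowed_sketch(allowed, attrs[i]))
            out[kept++] = attrs[i];
    }
    return kept;
}
```

With an allowed list of title and author, filtering the names title, md5, author, and description would keep only title and author.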

SOIFStream_IsAllowed
NSAPI_PUBLIC boolean_t SOIFStream_IsAllowed(SOIFStream *ss, char *attribute);

Indicates whether the given attribute is allowed (that is, whether it can be printed or parsed).

SOIFStream_SetAllowed
NSAPI_PUBLIC int SOIFStream_SetAllowed(SOIFStream *ss, char *allowed_attrs[])

Sets all the attributes in the allowed_attrs array to allowed.

SOIFStream_SetDenied
NSAPI_PUBLIC int SOIFStream_SetDenied(SOIFStream *ss, char *denied_attrs[]);

Sets all the attributes in the denied_attrs array to be denied (that is, they cannot be parsed or printed).

SOIFStream_GetAllowed
NSAPI_PUBLIC char **SOIFStream_GetAllowed(SOIFStream *ss)

Returns an array of all the attributes that are allowed.

SOIFStream_GetDenied
NSAPI_PUBLIC char **SOIFStream_GetDenied(SOIFStream *ss);

Returns an array of all the attributes that are denied.

Memory Buffer Management

You can use SOIF buffers in parsing or printing routines. They take care of memory allocation for inserting and appending. They are basically memory blocks that are easy for SOIF routines to use.

A SOIF buffer is represented by a SOIFBuffer structure, which is created with the SOIFBuffer_Create() function and freed with the SOIFBuffer_Free() function. The SOIFBuffer structure provides the append(), increase(), and reset() functions for manipulating the data in the buffer.

SOIFBuffer_Create
NSAPI_PUBLIC SOIFBuffer *SOIFBuffer_Create(int default_sz);

Creates a SOIFBuffer. The SOIFBuffer is used in SOIF_PrintInitStr(SOIFBuffer *memory): before you can print SOIF to memory, you need to create a buffer for output.

SOIFBuffer_Free
NSAPI_PUBLIC void SOIFBuffer_Free(SOIFBuffer *sb);

Releases the memory buffer created by SOIFBuffer_Create().

append
void (*append)(SOIFBuffer *sb, char *data, int n)

Copies n bytes of data into the buffer.

increase
void (*increase)(SOIFBuffer *sb, int add_n)

Increases the size of the data buffer by add_n bytes.

reset
void (*reset)(SOIFBuffer *sb)

Resets the size of the data buffer and invalidates all currently valid data. A buffer can be reused by resetting it this way.
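
The append(), increase(), and reset() members are function pointers stored in the buffer structure. The following sketch shows how a growable buffer of that shape could behave. It is a stand-in only: the MiniBuffer type, its field names, and the MiniBuffer_Create() and MiniBuffer_Free() functions are hypothetical and do not reflect the real SOIFBuffer layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, NOT the real SOIFBuffer layout. */
typedef struct MiniBuffer MiniBuffer;
struct MiniBuffer {
    char *data;                 /* allocated block */
    int   size;                 /* bytes of valid data */
    int   capacity;             /* bytes allocated */
    void (*append)(MiniBuffer *sb, char *data, int n);
    void (*increase)(MiniBuffer *sb, int add_n);
    void (*reset)(MiniBuffer *sb);
};

/* Grows the allocation by add_n bytes. */
static void mb_increase(MiniBuffer *sb, int add_n)
{
    assert(sb != NULL && add_n >= 0);
    char *grown = realloc(sb->data, (size_t)(sb->capacity + add_n));
    if (grown == NULL)
        abort();                /* out of memory: give up in this sketch */
    sb->data = grown;
    sb->capacity += add_n;
}

/* Copies n bytes of data to the end of the buffer, growing it if needed. */
static void mb_append(MiniBuffer *sb, char *data, int n)
{
    assert(sb != NULL && data != NULL && n >= 0);
    if (sb->size + n > sb->capacity)
        sb->increase(sb, sb->size + n - sb->capacity);
    memcpy(sb->data + sb->size, data, (size_t)n);
    sb->size += n;
}

/* Invalidates the current data but keeps the allocation for reuse. */
static void mb_reset(MiniBuffer *sb)
{
    assert(sb != NULL);
    sb->size = 0;
}

MiniBuffer *MiniBuffer_Create(int default_sz)
{
    assert(default_sz > 0);
    MiniBuffer *sb = malloc(sizeof *sb);
    if (sb == NULL)
        abort();
    sb->data = malloc((size_t)default_sz);
    if (sb->data == NULL)
        abort();
    sb->size = 0;
    sb->capacity = default_sz;
    sb->append = mb_append;
    sb->increase = mb_increase;
    sb->reset = mb_reset;
    return sb;
}

void MiniBuffer_Free(MiniBuffer *sb)
{
    if (sb != NULL) {
        free(sb->data);
        free(sb);
    }
}
```

In this sketch a call such as sb->append(sb, text, len) grows the block on demand, and sb->reset(sb) makes the same block available for the next use.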