Sun Java System Portal Server 6 2004Q2 Developer's Guide 

Chapter 20
Overview of the SOIF API

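This chapter contains the following sections:

    What is SOIF?
    Using the SOIF API
    An Introductory Example
    Getting Search Server Database Contents as a SOIFStream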

Resource descriptions in the search engine database are expressed in SOIF (Summary Object Interchange Format), as are the Resource Description Messages (RDMs) that processes use to exchange resource descriptions across a network.

The SOIF API provides routines for creating and modifying SOIF objects in C.

The SOIF API is defined in the soif.h file in the following directory in your installation:

    portal-server-install-root/SUNWps/sdk/rdm/include

This chapter covers only the C functions that come with the search engine SOIF API, so you should have a basic understanding of the C programming language.


To support all languages correctly, all SOIF data should use the UTF-8 character set. UTF-8 is fully backward compatible with 7-bit ASCII, so existing ASCII SOIF data remains valid.
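
One way to follow this recommendation is to validate strings before passing them to the SOIF routines. The following is a minimal sketch of such a check, not part of soif.h: the function name is_valid_utf8 is hypothetical, and the rules it applies (no truncated or overlong sequences, no surrogates, nothing above U+10FFFF) are those of the UTF-8 standard rather than anything specific to the search engine.

```c
#include <stddef.h>

/* Hypothetical helper, not part of soif.h: return 1 if the len bytes at s
 * form well-formed UTF-8, and 0 otherwise. Pure 7-bit ASCII always passes,
 * because every byte below 0x80 is a complete UTF-8 character. */
int is_valid_utf8(const char *s, size_t len)
{
    const unsigned char *u = (const unsigned char *)s;
    size_t i = 0;

    while (i < len) {
        unsigned char c = u[i];
        unsigned long cp;   /* decoded code point */
        size_t extra, k;    /* number of continuation bytes expected */

        if (c < 0x80) {     /* ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= extra)
            return 0;       /* sequence is cut off at the end of the data */

        for (k = 1; k <= extra; k++) {
            if ((u[i + k] & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (u[i + k] & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and out-of-range values. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        i += extra + 1;
    }
    return 1;
}
```

The length is passed explicitly so that the check works on raw bytes and does not depend on a terminating NUL.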

What is SOIF?

SOIF stands for Summary Object Interchange Format. It is a syntax that can be used in numerous situations. In particular, it is used to describe resource descriptions (RDs) in the search engine database.

The SOIF format is a basic attribute-value format. SOIF files look like text but should be treated as binary data and edited with care. SOIF files contain tabs, and many editors will convert tabs to spaces and corrupt the file. You can use SOIF-manipulation functions to create and modify SOIF objects so you do not have to write and edit them manually.

The following sample SOIF describes a document, whose title is “Rescuing English Springer Spaniels”, whose author is “Jocelyn Becker” and whose URL is


    title{34}: Rescuing English Springer Spaniels

    author{14}: Jocelyn Becker


Each SOIF object has a schema-name (or template type) and an associated URL, and it contains a list of attribute-value pairs. In this case, the schema name is @DOCUMENT, which indicates this resource is a document. Title and author are both attribute names, and you can see that each attribute has a value.
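
To make the attribute syntax concrete, the following sketch formats a single attribute line. It is only an illustration and does not use soif.h: the function soif_format_attr is hypothetical, and it assumes the conventional SOIF layout in which the number in braces is the length of the value in bytes and a tab follows the colon. In real code, the SOIF API functions produce this text for you.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of soif.h: write one attribute line of the
 * form "name{byte-count}:<TAB>value<NEWLINE>" into out.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if the line does not fit in outsize bytes. */
int soif_format_attr(char *out, size_t outsize,
                     const char *name, const char *value)
{
    /* The count is in bytes, not characters. For UTF-8 text the two
     * differ whenever a character takes more than one byte. */
    size_t nbytes = strlen(value);
    int n = snprintf(out, outsize, "%s{%lu}:\t%s\n",
                     name, (unsigned long)nbytes, value);

    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return n;
}
```

Called with the name "title" and the value "Rescuing English Springer Spaniels", this produces the title{34} line of the sample above, with a tab after the colon. Because the count, and not the end of the line, delimits the value, editing a value by hand without updating the count corrupts the object.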

Using the SOIF API

The SOIF API is defined in the soif.h header file in directory portal-server-install-root/SUNWps/sdk/rdm/include.

The SOIF API defines structures and functions for working with SOIF objects. For example, the following code uses the functions SOIF_Create() and SOIF_InsertStr() to create a SOIF and add some attribute-value pairs to it:

SOIF *mysoif = SOIF_Create("DOCUMENT", "http://varrius/doc.htm");
SOIF_InsertStr(mysoif, "title", "All About Style Sheets");
SOIF_InsertStr(mysoif, "author", "Robin Styles");
SOIF_InsertStr(mysoif, "description", "All you need to know about style sheets");

These commands create a SOIF like the following example:

@DOCUMENT { http://varrius/doc.htm
    title{22}: All About Style Sheets
    author{12}: Robin Styles
    description{39}: All you need to know about style sheets
}


Each SOIF object contains attribute-value pairs, each represented as a SOIFAVPair object. Using the SOIF API, you can get and set attribute values, create and delete attribute-value pairs, and add values to existing attributes. (Some attributes can have multiple values.)

Multiple SOIF objects can be grouped together into SOIF streams, which are represented by SOIFStream objects. A SOIFStream object provides functionality for handling a stream of SOIF objects. For example, you can use the stream to filter attributes, and print the desired attributes for every SOIF in the stream.

Thus, the relevant data structures when using the SOIF API include:

    SOIF
    SOIFAVPair
    SOIFStream
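
As a rough illustration of what handling a stream of SOIF objects involves, the following self-contained sketch walks a string that holds several SOIF objects and counts them. It does not use soif.h: the function soif_count_objects is hypothetical, and it assumes the conventional SOIF text layout (a header line of the form "@SCHEMA { URL", attribute lines of the form "name{n}:<TAB>value" where n is the length of the value in bytes, and a closing "}"). It also assumes that no value contains a NUL byte. In real code, the SOIFStream routines do this work.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of soif.h: count the SOIF objects in a
 * NUL-terminated string. Returns the count, or -1 if the text is malformed. */
int soif_count_objects(const char *text)
{
    const char *p = text;
    const char *end = text + strlen(text);
    int count = 0;

    for (;;) {
        /* Skip blank space between objects. */
        p += strspn(p, " \t\r\n");
        if (p == end)
            return count;
        if (*p != '@')
            return -1;

        /* Header line: the opening brace must be on the same line. */
        p += strcspn(p, "{\n");
        if (*p != '{')
            return -1;
        p += strcspn(p, "\n");

        /* Attribute lines, up to the closing brace. */
        for (;;) {
            char *numend;
            unsigned long n;

            p += strspn(p, " \t\r\n");
            if (*p == '}') {
                p++;
                break;
            }

            /* Attribute name, then "{n}:" and a tab. */
            p += strcspn(p, "{\n");
            if (*p != '{' || !isdigit((unsigned char)p[1]))
                return -1;
            n = strtoul(p + 1, &numend, 10);
            if (strncmp(numend, "}:\t", 3) != 0)
                return -1;
            p = numend + 3;

            /* Skip exactly n bytes. The value may itself contain newlines
             * or braces, so it cannot be scanned for a terminator. */
            if ((size_t)(end - p) < n)
                return -1;
            p += n;
        }
        count++;
    }
}
```

The inner loop relies on the byte count rather than on line endings, which is why SOIF files should be treated as binary data even though they look like text.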

An Introductory Example

You will find several examples of the use of the SOIF API in portal-server-install-root/SUNWps/sdk/rdm/examples.

This section discusses an example that is similar to (but not necessarily identical to) example1.c. It shows how to iterate through a SOIF stream and print the URL and number of attributes of each SOIF in the stream.

This example assumes that you have already created a file containing a SOIF stream which is available on stdin. For example, you could have created a SOIF stream containing one or more RDs from the search engine database, which you would do by using the routines in RDM.h.

This example uses SOIF_ParseInitFile() to create a SOIF stream from the standard input.

Code Example 20-1  Simple SOIF Stream Parsing Example  

/* Example 1 - Simple SOIF Stream Parsing */
#include <stdio.h>
#include <stdlib.h>
#include "soif.h"

int main(int argc, char *argv[])
{
    /* Define a SOIFStream and SOIF */
    SOIFStream *ss;
    SOIF *s;
    char *titleptr;

    /* Open a SOIF stream that gets its SOIF from stdin */
    ss = SOIF_ParseInitFile(stdin);

    /* SOIFStream_IsEOS() checks if this is the end of the stream */
    while (!SOIFStream_IsEOS(ss)) {
        if (!(s = SOIFStream_Parse(ss)))
            /* Exit the loop if the SOIF is invalid */
            break;

        /* Print the URL for each SOIF (will be "-" if there is no URL) */
        printf("URL = %s\n", s->url);

        /* Print the title if it exists. */
        titleptr = SOIF_Findval(s, "title");
        printf("Title = %s\n", titleptr ? titleptr : "(none)");

        /* Print the number of attributes in the SOIF */
        printf("# of Attributes = %d\n", SOIF_GetAttributeCount(s));

        /* Release the memory used by the SOIF */
        SOIF_Free(s);
    }

    /* Close the SOIFStream and exit */
    SOIFStream_Finish(ss);
    exit(0);
}

Getting Search Server Database Contents as a SOIFStream

You can retrieve the entire contents of the search engine database as a SOIF stream by using the rdmgr utility. The rdmgr utility must be run in a search-enabled Sun Java System Portal Server software instance directory. The default is WebContainer/portal.

From the WebContainer/portal directory, run the following command:

/var/opt/SUNWps/bin/rdmgr -U

Be sure that the environment variable LD_LIBRARY_PATH is set to portal-server-install-root/lib.

This command prints the database contents as a SOIFStream. You can pipe the output to a program that uses SOIFStream routines to parse the SOIFs in the stream.


Copyright 2004 Sun Microsystems, Inc. All rights reserved.