This chapter describes the functions and objects defined in the soif.h header file.
Table 21–1 provides an alphabetized version of the functions and objects for your reference.
Table 21–1 Alphabetized Functions and Objects Defined in the soif.h File
SOIF function or object | Category |
---|---|
append, increase, reset, SOIFBuffer_Create, SOIFBuffer_Free | Memory Buffer Management |
SOIF_Apply, SOIF_Create, SOIF_Find, SOIF_Findval, SOIF_Free, SOIF_GetAttributeCount, SOIF_GetAttributeSize, SOIF_GetTotalSize, SOIF_GetValueCount, SOIF_GetValueSize, SOIF_InsertAVP, SOIF_Merge, SOIF_Remove | SOIF Structure |
SOIF_AttributeCompare, SOIF_InsertStr, SOIF_Rename, SOIF_Replace, SOIF_ReplaceMV, SOIF_ReplaceStr, SOIF_SqueezeMV, SOIFAVPair_Create, SOIFAVPair_Free | Attribute-Value Pair Routines |
SOIF_AttributeCompareMV, SOIF_Contains, SOIF_DeleteMV, SOIF_FindvalMV, SOIF_Insert, SOIF_InsertMV, SOIF_IsMVAttribute, SOIF_MVAttributeParse, SOIFAVPair_IsMV, SOIFAVPair_NthValid, SOIFAVPair_NthValue, SOIFAVPair_NthVsize | Multi-valued Attribute Routines |
SOIF_ParseInitFile, SOIF_ParseInitStr, SOIF_PrintInitFile, SOIF_PrintInitFn, SOIF_PrintInitStr, SOIFStream_Finish, SOIFStream_GetAllowed, SOIFStream_GetDenied, SOIFStream_IsAllowed, SOIFStream_IsEOS, SOIFStream_IsParsing, SOIFStream_IsPrinting, SOIFStream_Parse, SOIFStream_Print, SOIFStream_SetAllowed, SOIFStream_SetDenied, SOIFStream_SetFinishFn | Stream Routines for Parsing and Printing SOIFs |
A SOIF has a schema-name and associates a URL with a collection of attribute-value pairs. The schema-name identifies how to interpret the attribute-value pairs. SOIF supports text and binary data, and attributes can have multiple values.
An example SOIF is the following:
@DOCUMENT { http://www.siroe.com/
title{17}: Welcome to Siroe!
author{13}: Dot Punchcard
}
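To make the {N} counts in the example concrete, the following is a minimal sketch of formatting one attribute-value line. The name format_avp_line is hypothetical and is not part of soif.h. The sketch assumes that the count in braces is the byte length of the value, that the value is a null-terminated string (binary values would need an explicit size), and that a single space follows the colon, as the example is rendered here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of soif.h: writes "attribute{N}: value"
 * into out, where N is the byte length of value. Returns the number of
 * characters written, or -1 if out is too small or formatting fails. */
int format_avp_line(char *out, size_t outsz, const char *attribute, const char *value)
{
    assert(out != NULL && attribute != NULL && value != NULL);
    int n = snprintf(out, outsz, "%s{%zu}: %s", attribute, strlen(value), value);
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    return n;
}
```

Called with the attribute title and the value Welcome to Siroe!, the helper produces the same title line as the example, because the value is 17 bytes long.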
A SOIF object has URL and schema-name fields to store its URL and schema_name:
/* The URL */
/* The Schema-Name, such as @document or @RDMHeader*/
A SOIF object contains a collection of SOIFAVPair objects, each of which contains an attribute and one or more values. To access attribute values in a SOIF, use SOIF_Find() to retrieve the AVPair for the given attribute, or use SOIF_Findval() to retrieve the value string for a given attribute. You must use all lowercase for attribute names with SOIF_Find() and SOIF_Findval(), since only exact attribute name lookups are supported.
You can create SOIF objects by using the SOIF_Create() function. You can also read SOIF objects from a SOIF stream.
NSAPI_PUBLIC SOIF *SOIF_Create(char *schema_name, char *url)
Creates a SOIF structure with the given schema name and URL.
NSAPI_PUBLIC void SOIF_Free(SOIF *)
Frees the given SOIF structure.
NSAPI_PUBLIC int SOIF_GetTotalSize(SOIF *s)
Gets the estimated total size of the SOIF in bytes.
NSAPI_PUBLIC int SOIF_GetAttributeCount(SOIF *s)
Gets the number of attributes in the SOIF.
NSAPI_PUBLIC int SOIF_GetAttributeSize(SOIF *s)
Gets the size of the attributes only.
NSAPI_PUBLIC int SOIF_GetValueSize(SOIF *s)
Gets the size of the values only.
NSAPI_PUBLIC int SOIF_GetValueCount(SOIF *s)
Gets the number of values only.
NSAPI_PUBLIC int SOIF_Merge(SOIF *dst, SOIF *src);
Use this function to merge two SOIF objects (that is, to perform a union of their attribute-value pairs). It returns non-zero on error; otherwise, it returns zero and the 'dst' SOIF object contains all the attribute-value pairs from the 'src' SOIF object.
If the 'dst' object contains the same attribute as 'src', then the attribute becomes a multi-valued attribute and all of the values are copied over to 'dst'. Only multi-valued attributes are copied over. For single-valued attributes, the value in 'dst' is discarded. Currently only “classification” is a multi-valued attribute.
#define SOIF_Find(soif, attribute-name)
Retrieves the AVPair for the given attribute in the given soif. For example, the following statement gets the AVPair for the title attribute in the soif s:
SOIFAVPair *avp = SOIF_Find(s, "title");
#define SOIF_Findval(soif, attribute-name)
Retrieves the value string for the given attribute in the given soif. For example, the following statement prints the value of the title attribute of the soif s:
printf("Title = %s\n", SOIF_Findval(s, "title"));
#define SOIF_Remove(soif, attribute-name)
Removes the given attribute from the given soif.
#define SOIF_Insert(soif, attribute-name, value, value-size)
Inserts the given attribute and the value of the given size as an AVPair into the soif.
#define SOIF_InsertAVP(soif, avpair)
Inserts the given AVPair into the given soif.
#define SOIF_Apply(soif, function, user-data)
Applies the given function with the given argument (user-data) to each AVPair in the given soif. For example:
void print_av(SOIF *s, SOIFAVPair *avp, void *unused)
{
    printf("%s = %s\n", avp->attribute, avp->value);
}
/* print every attribute and value in the soif s */
SOIF_Apply(s, print_av, NULL);
Attribute-value pairs contain an attribute and an associated value. The value often is a simple null-terminated string; however, the value can also be binary data. Attribute-value pairs are stored as SOIFAVPair structures.
The important fields in a SOIFAVPair structure are the following:
Attribute string; '\0' terminated
Primary value; may be '\0' terminated
Number of bytes (8 bits) for primary value
Multiple values for multivalued attributes
The sizes for the values
Number of values associated with attribute
Last valid slot - array may contain holes
NSAPI_PUBLIC SOIFAVPair * SOIFAVPair_Create(char *a, char *v, int vsz);
Creates an AVPair structure with the given attribute a and value v. The value v is a buffer of vsz bytes.
NSAPI_PUBLIC void SOIFAVPair_Free(SOIFAVPair *avp);
Frees the memory used by the given SOIFAVPair structure.
NSAPI_PUBLIC int SOIF_Replace(SOIF *s, char *att, char *val, int valsz);
Replaces the value of an existing attribute att with a new value val of size valsz in the SOIF s.
#define SOIF_InsertStr(soif, attribute, value)
Inserts the given attribute with the given value into the soif.
#define SOIF_ReplaceStr(soif, attribute, value)
Replaces the existing value of the given attribute in the soif with the given value.
NSAPI_PUBLIC int SOIF_Rename(SOIF *s, char *old_attr, char *new_attr);
Renames the given attribute to the given new name.
NSAPI_PUBLIC int SOIF_AttributeCompare(const char *a1, const char *a2);
Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. Case (upper and lower) and trailing hyphens (-) are ignored when comparing attribute names. The following table illustrates the results of comparing some attribute names.
Attribute A | Attribute B | Does SOIF_AttributeCompare() consider them to be the same? |
---|---|---|
title | Title | yes |
title | Title | yes |
title | title | yes |
title | title-page | no |
title | title | no |
author | title | no |
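The comparison rule can be sketched as follows. This is a hypothetical stand-in named attr_compare_sketch, not the library's SOIF_AttributeCompare(), and it assumes that the trailing characters ignored by the comparison are hyphens. It returns 0 when the names are considered the same and non-zero otherwise.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of the name with any trailing '-' characters dropped. */
static size_t trimmed_len(const char *a)
{
    size_t n = strlen(a);
    while (n > 0 && a[n - 1] == '-')
        n--;
    return n;
}

/* Hypothetical stand-in, not part of soif.h: compares two attribute names,
 * ignoring case and trailing hyphens. Returns 0 if equal, 1 otherwise. */
int attr_compare_sketch(const char *a1, const char *a2)
{
    assert(a1 != NULL && a2 != NULL);
    size_t n1 = trimmed_len(a1);
    size_t n2 = trimmed_len(a2);
    if (n1 != n2)
        return 1;
    for (size_t i = 0; i < n1; i++) {
        if (tolower((unsigned char)a1[i]) != tolower((unsigned char)a2[i]))
            return 1;
    }
    return 0;
}
```

Under these assumptions, title and Title compare as equal, while title and title-page do not, because the hyphen in title-page is not trailing.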
A SOIF attribute can have multiple values. SOIF supports the convention of using a -NNN suffix to indicate a multi-valued attribute, for example, Title-1, Title-2, Title-3, and so on. The -NNN suffixes do not need to be sequential positive integers.
The Search Engine supports searching on multi-valued attributes such as the classification attribute. In SOIF, such an attribute is represented as classification-1, classification-2, and so on. For example:
classification-1{5}: robot
classification-2{5}: siroe
classification-3{11}: web crawler
NSAPI_PUBLIC int SOIF_AttributeCompareMV(const char *a1, const char *a2);
Compares two attribute names. Returns 0 (zero) if they are equal, or non-zero if they are different. If neither attribute is multi-valued, the comparison is the same as for SOIF_AttributeCompare(). If one or both of the attributes are multi-valued, the base name of the multi-valued attribute is used for the comparison. The base name of a multi-valued attribute is the portion of the name before “-”. For example, the base name of classification-3 is classification.
NSAPI_PUBLIC int SOIF_MVAttributeParse(char *a)
Returns the multi-valued number of the given attribute and strips the -NNN indicator from the attribute string. For a normal (non-multi-valued) attribute name, returns zero. For example, classification-3 returns the number 3.
NSAPI_PUBLIC char *SOIF_IsMVAttribute(const char *a);
Returns NULL if the given attribute is not a multi-valued attribute; otherwise returns a pointer to where the multi-valued number occurs in the attribute string. For example, for the multi-valued attribute classification-3, it will return the pointer to 3.
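The -NNN handling described for these two routines can be sketched as follows. The names mv_suffix_sketch and mv_parse_sketch are hypothetical and are not part of soif.h; the sketch assumes that a multi-valued name ends in a hyphen followed only by decimal digits, which the real library may define differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: returns a pointer to the digits after the last
 * '-' if the name ends in "-NNN", or NULL for a normal attribute name. */
const char *mv_suffix_sketch(const char *a)
{
    assert(a != NULL);
    const char *dash = strrchr(a, '-');
    if (dash == NULL || dash == a || dash[1] == '\0')
        return NULL;
    for (const char *p = dash + 1; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return NULL;
    }
    return dash + 1;
}

/* Hypothetical stand-in: returns the number NNN and truncates the name at
 * the hyphen, or returns 0 and leaves the name alone if there is no suffix. */
int mv_parse_sketch(char *a)
{
    const char *digits = mv_suffix_sketch(a);
    if (digits == NULL)
        return 0;
    int n = atoi(digits);
    a[(digits - a) - 1] = '\0';   /* overwrite the '-' to keep the base name */
    return n;
}
```

With a writable buffer holding classification-3, mv_parse_sketch() returns 3 and leaves classification in the buffer. A name such as title-page has no numeric suffix, so mv_suffix_sketch() returns NULL for it.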
NSAPI_PUBLIC int SOIF_InsertMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval)
Inserts a new value v at index slot for the given attribute a (in non-multivalue form). If set, the useval flag tells the function to use the given value buffer rather than creating its own copy.
For example:
SOIF_InsertMV(s, "classification", 3, "web crawler", strlen("web crawler"), 0);
Inserts
classification-3{11}: web crawler
NSAPI_PUBLIC int SOIF_ReplaceMV(SOIF *s, char *a, int slot, char *v, int vsz, int useval);
NSAPI_PUBLIC int SOIF_DeleteMV(SOIF *s, char *a, int slot)
Deletes the value at the index slot in the attribute a. For example:
SOIF_DeleteMV(s, "classification", 3)
Deletes classification-3.
NSAPI_PUBLIC const char *SOIF_FindvalMV(SOIF *s, const char *a, int slot)
Finds the value at the index slot in the attribute a. For example:
SOIF_FindvalMV(s, "classification", 3)
Returns web crawler (using the previous example).
NSAPI_PUBLIC void SOIF_SqueezeMV(SOIF *s)
Forces a renumbering to ensure that the multi-value indexes are sequentially increasing (for example, 1, 2, 3,...). This function can be used to fill in any holes that might have occurred during SOIF_InsertMV() invocations. For example, to insert values explicitly for the multivalue attribute author-*:
SOIF_InsertMV(s, "author", 1, "John", 4, 0);
SOIF_InsertMV(s, "author", 2, "Kevin", 5, 0);
SOIF_InsertMV(s, "author", 6, "Darren", 6, 0);
SOIF_InsertMV(s, "author", 9, "Tommy", 5, 0);
SOIF_FindvalMV(s, "author", 9); /* == "Tommy" */
SOIF_SqueezeMV(s);
SOIF_FindvalMV(s, "author", 9); /* == NULL */
SOIF_FindvalMV(s, "author", 4); /* == "Tommy" */
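The renumbering in this example can be pictured with a small sketch over a plain array of value pointers, where NULL marks a hole. The name squeeze_sketch is hypothetical and is not how soif.h necessarily stores values; the sketch assumes that slots are numbered from 1, as in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: compacts the non-NULL entries of
 * values[1..last_slot] toward slot 1, preserving their order.
 * Returns the new last occupied slot (0 if there are no values). */
int squeeze_sketch(const char **values, int last_slot)
{
    assert(values != NULL && last_slot >= 0);
    int next = 1;                      /* next slot to fill */
    for (int i = 1; i <= last_slot; i++) {
        if (values[i] != NULL) {
            const char *v = values[i];
            values[i] = NULL;          /* vacate the old slot */
            values[next++] = v;        /* may be the same slot when i == next */
        }
    }
    return next - 1;
}
```

Starting from values in slots 1, 2, 6, and 9, the sketch leaves them in slots 1 through 4, so the value that was in slot 9 ends up in slot 4 and slot 9 becomes empty.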
#define SOIFAVPair_IsMV(avp)
Use this to determine if the AVPair has multiple values or not.
#define SOIFAVPair_NthValid(avp,n)
Use this to determine if the Nth value is valid or not.
#define SOIFAVPair_NthValue(avp,n) ((avp)->values[n])
Use this to access the Nth value. For example:
for (i = 0; i <= avp->last_slot; i++)
    if (SOIFAVPair_NthValid(avp, i))
        printf("%s = %s\n", avp->attribute, SOIFAVPair_NthValue(avp, i));
#define SOIFAVPair_NthVsize(avp,n) ((avp)->vsizes[n])
Use this to get the size of the Nth value.
NSAPI_PUBLIC boolean_t SOIF_Contains(SOIF *s, char *a, char *v, int vsz);
Indicates if the given attribute contains the given value. It returns B_TRUE if the value matches one or more of the values of the attribute a in the given SOIF s.
A SOIFStream contains one or more SOIF objects.
The general approach is to use SOIF streams to create and process streams of many SOIF objects. Given a SOIF stream, you can parse it to get the SOIF objects from it. Use SOIFStream_Parse() to get the next SOIF object in a SOIF stream. You can use SOIFStream_IsEOS() to check whether the last object has been parsed.
You can use filtering functions for a SOIF stream to specify that certain SOIF attributes are allowed or denied. If an attribute is allowed, you can parse and print that attribute for SOIF objects in the stream. If it is denied, you cannot parse or print that attribute of SOIF objects in the stream.
SOIF streams can be disk or memory based.
When you create a SOIFStream, you need to specify if you will be printing or parsing the SOIF stream, and if you will be using a memory- or disk-based stream. The functions you need to use will depend on what you will be doing with the SOIF stream.
To create a SOIF stream into which you will print SOIFs, use the following functions:
SOIF_PrintInitFile() creates a disk-based stream ready for printing.
SOIF_PrintInitStr() creates a memory-based stream ready for printing.
SOIF_PrintInitFn() creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.
To create a SOIF stream from a file or a string containing SOIF, use the following functions:
SOIF_ParseInitFile() creates a disk-based stream ready for parsing. The stream is created from an input containing SOIF syntax.
SOIF_ParseInitStr() creates a memory-based stream ready for parsing. The stream is created from an input containing SOIF syntax.
SOIFStream objects have a caller-data field, which you can use as you like:
void *caller_data; /* hook to be used by caller */
Use SOIFStream_Parse() to get the SOIF objects from the SOIF stream, and use SOIFStream_Print() to write SOIF objects to the SOIF stream.
When you’ve finished with the stream, close it by using SOIFStream_Finish(). Use SOIFStream_SetFinishFn() to register a finish_fn function that is called when SOIFStream_Finish() has finished executing.
The following example code takes a SOIF stream in stdin and prints each SOIF in the stream to stdout. Notice that this code uses SOIF_ParseInitFile() to create the SOIFStream to parse the input file, and uses SOIF_PrintInitFile() to create the stream to print the SOIFs to stdout.
SOIFStream *soifin = SOIF_ParseInitFile(stdin);
SOIFStream *soifout = SOIF_PrintInitFile(stdout);
SOIF *s;
while (!SOIFStream_IsEOS(soifin)) {
    if ((s = SOIFStream_Parse(soifin))) {
        SOIFStream_Print(soifout, s);
        SOIF_Free(s);
    }
}
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFile(FILE *file)
Creates a disk-based stream ready for printing.
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitStr(SOIFBuffer *memory)
Creates a memory-based stream ready for printing.
NSAPI_PUBLIC SOIFStream *SOIF_PrintInitFn(int (*write_fn)(void *data,char *buf, int bufsz), void *data)
Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.
This function allows you to hook up your own routine for printing.
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitFile(FILE *fp)
Creates a disk-based stream ready for parsing. The file must contain SOIF-formatted data. The function reads SOIF data from the file object fp.
NSAPI_PUBLIC SOIFStream *SOIF_ParseInitStr(char *buf, int bufsz)
Creates a memory-based stream ready for parsing. The character buffer must contain SOIF-formatted data.
NSAPI_PUBLIC int SOIFStream_Finish(SOIFStream *)
Closes the stream when you have finished with it.
NSAPI_PUBLIC int SOIFStream_SetFinishFn(SOIFStream *, int (*finish_fn)(SOIFStream *))
Allows you to hook up a function for cleaning up after the SOIF stream is finished. The finish_fn is called when SOIFStream_Finish() has finished executing.
#define SOIFStream_Print(ss, s)
Prints the SOIF object s to the SOIF stream ss. Returns 0 on success, or non-zero on error.
#define SOIFStream_Parse(ss)
Parses and returns the next SOIF object in the SOIF stream.
#define SOIFStream_IsEOS(s)
Returns 1 (true) if the SOIF stream has been exhausted.
#define SOIFStream_IsPrinting(s)
Returns 1 (true) if the SOIF stream was set up for printing by SOIF_PrintInitFile() or SOIF_PrintInitStr().
#define SOIFStream_IsParsing(s)
Returns 1 (true) if the SOIF stream was set up for parsing by SOIF_ParseInitFile() or SOIF_ParseInitStr().
To support targeted parsing and printing, you can use the attribute filtering mechanisms in the SOIF stream. For each SOIF stream object, you can associate a list of allowed attributes. When printing a SOIF stream, only the attributes that match the allowed attributes will be printed. When parsing a SOIF stream, only the attributes that match the allowed attributes will be parsed.
SOIFStream_SetAllowed() and SOIFStream_SetDenied() mark attributes as allowed or denied, while SOIFStream_IsAllowed() and SOIFStream_IsDenied() test whether an attribute is allowed or denied. You can allow or deny an attribute, but not both.
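As an illustration of the filtering idea only, the following sketch checks an attribute name against a list of allowed names. The name is_allowed_sketch is hypothetical and is not part of soif.h. The sketch assumes that the list is terminated by a NULL entry and that names are matched exactly and case-sensitively; the library's own matching rules may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: returns 1 if attribute appears in the
 * NULL-terminated allowed list, 0 otherwise. */
int is_allowed_sketch(const char *const allowed[], const char *attribute)
{
    assert(allowed != NULL && attribute != NULL);
    for (size_t i = 0; allowed[i] != NULL; i++) {
        if (strcmp(allowed[i], attribute) == 0)
            return 1;
    }
    return 0;
}
```

A printing or parsing loop built on this idea would call the check once per attribute and skip any attribute for which it returns 0.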
NSAPI_PUBLIC boolean_t SOIFStream_IsAllowed(SOIFStream *ss, char *attribute);
Indicates whether the given attribute is allowed (that is, whether it can be printed or parsed).
NSAPI_PUBLIC int SOIFStream_SetAllowed(SOIFStream *ss, char *allowed_attrs[])
Sets all the attributes in the allowed_attrs array to allowed.
NSAPI_PUBLIC int SOIFStream_SetDenied(SOIFStream *ss, char *denied_attrs[]);
Sets all the attributes in the denied_attrs array to be denied (that is, they cannot be parsed or printed).
NSAPI_PUBLIC char **SOIFStream_GetAllowed(SOIFStream *ss)
Returns an array of all the attributes that are allowed.
NSAPI_PUBLIC char **SOIFStream_GetDenied(SOIFStream *ss);
Returns an array of all the attributes that are denied.
You can use SOIF buffers in parsing or printing routines. They take care of memory allocation for inserting and appending. They are basically memory blocks that are easy for SOIF routines to use.
A SOIF buffer is represented by a SOIFBuffer structure, which is created with the SOIFBuffer_Create() function and freed with the SOIFBuffer_Free() function. The SOIFBuffer structure provides the append(), increase(), and reset() functions for manipulating the data in the buffer.
NSAPI_PUBLIC SOIFBuffer *SOIFBuffer_Create(int default_sz);
The SOIFBuffer is used in SOIF_PrintInitStr(SOIFBuffer *memory). Before you can print SOIF to memory, you need to create a buffer for output.
NSAPI_PUBLIC void SOIFBuffer_Free(SOIFBuffer *sb);
Releases the memory buffer created by SOIFBuffer_Create().
void (*append)(SOIFBuffer *sb, char *data, int n)
Copies n bytes of data into the buffer.
void (*increase)(SOIFBuffer *sb, int add_n)
Increases the size of the data buffer by add_n bytes.
void (*reset)(SOIFBuffer *sb)
Resets the size of the data buffer and invalidates all currently valid data. A buffer can be reused by resetting it this way.
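The pattern of a buffer structure that carries its own append, increase, and reset function pointers can be sketched as follows. SketchBuffer and the sketch_ functions are hypothetical stand-ins written for illustration; the actual SOIFBuffer layout and behavior are defined in soif.h and may differ. The sketch assumes that reset keeps the allocated memory and only marks the data as invalid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; not the SOIFBuffer definition from soif.h. */
typedef struct SketchBuffer SketchBuffer;
struct SketchBuffer {
    char *data;     /* allocated storage */
    int size;       /* bytes of valid data */
    int capacity;   /* bytes allocated */
    void (*append)(SketchBuffer *sb, const char *data, int n);
    void (*increase)(SketchBuffer *sb, int add_n);
    void (*reset)(SketchBuffer *sb);
};

/* Grow the allocation by add_n bytes; aborts if memory runs out. */
static void sketch_increase(SketchBuffer *sb, int add_n)
{
    assert(sb != NULL && add_n >= 0);
    char *grown = realloc(sb->data, (size_t)sb->capacity + (size_t)add_n);
    if (grown == NULL)
        abort();
    sb->data = grown;
    sb->capacity += add_n;
}

/* Copy n bytes onto the end of the valid data, growing first if needed. */
static void sketch_append(SketchBuffer *sb, const char *data, int n)
{
    assert(sb != NULL && data != NULL && n >= 0);
    if (sb->size + n > sb->capacity)
        sb->increase(sb, sb->size + n - sb->capacity);
    memcpy(sb->data + sb->size, data, (size_t)n);
    sb->size += n;
}

/* Invalidate the current data so the buffer can be reused. */
static void sketch_reset(SketchBuffer *sb)
{
    assert(sb != NULL);
    sb->size = 0;
}

SketchBuffer *sketch_buffer_create(int default_sz)
{
    SketchBuffer *sb = malloc(sizeof *sb);
    if (sb == NULL)
        abort();
    sb->capacity = default_sz > 0 ? default_sz : 1;
    sb->data = malloc((size_t)sb->capacity);
    if (sb->data == NULL)
        abort();
    sb->size = 0;
    sb->append = sketch_append;
    sb->increase = sketch_increase;
    sb->reset = sketch_reset;
    return sb;
}

void sketch_buffer_free(SketchBuffer *sb)
{
    if (sb != NULL) {
        free(sb->data);
        free(sb);
    }
}
```

Calling the operations through the structure, as in sb->append(sb, data, n), mirrors the way the function pointers are declared in this section. Growing only by the exact shortfall keeps the sketch short; a production buffer would more likely grow geometrically.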