This chapter describes the functions and objects defined in the search.h header file.
The following table provides an alphabetized version of the functions and objects for your reference.
Table 24–1 Alphabetized Functions and Objects Defined in the search.h File
| Search function or object | Category |
|---|---|
| append, increase, reset, SearchBuffer_Create, SearchBuffer_Free | Memory Buffer Management |
| Search_Apply, Search_Create, Search_Find, Search_Findval, Search_Free, Search_AttributeCompare, Search_GetAttributeSize, Search_GetTotalSize, Search_GetValueCount, Search_GetValueSize, Search_InsertAVP, Search_Merge, Search_Remove | Search Structure |
| Search_AttributeCompare, Search_InsertStr, Search_Rename, Search_Replace, Search_ReplaceMV, Search_ReplaceStr, Search_SqueezeMV, SearchAVPair_Create, SearchAVPair_Free | Attribute-Value Pair Routines |
| Search_AttributeCompareMV, Search_Contains, Search_DeleteMV, Search_FindvalMV, Search_Insert, Search_InsertMV, Search_IsMVAttribute, Search_MVAttributeParse, SearchAVPair_IsMV, SearchAVPair_NthValid, SearchAVPair_NthValue, SearchAVPair_NthVsize | Multi-valued Attribute Routines |
| Search_ParseInitFile, Search_ParseInitStr, Search_PrintInitFile, Search_PrintInitFn, Search_PrintInitStr, SearchStream_Finish, SearchStream_GetAllowed, SearchStream_GetDenied, SearchStream_IsAllowed, SearchStream_IsEOS, SearchStream_IsParsing, SearchStream_IsPrinting, SearchStream_Parse, SearchStream_Print, SearchStream_SetAllowed, SearchStream_SetDenied, SearchStream_SetFinishFn | Stream Routines for Parsing and Printing Searches |
A Search has a schema-name and associates a URL with a collection of attribute-value pairs. The schema-name identifies how to interpret the attribute-value pairs. Search supports both text and binary data, and an attribute can have multiple values.
Example of a Search:
@DOCUMENT { http://www.siroe.com/
title{17}: Welcome to Siroe!
author{13}: Dot Punchcard
}
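In the example above, the number in braces matches the byte count of the value that follows ("Welcome to Siroe!" is 17 bytes). The following is a minimal sketch of how such a line could be formatted. The helper name demo_format_avline is hypothetical and is not part of search.h; it only illustrates the attribute{size}: value layout for null-terminated text values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of search.h: formats one attribute-value
 * line as "attribute{size}: value", where size is the byte count of value.
 * Returns the number of characters written, or -1 if out is too small. */
int demo_format_avline(char *out, size_t outsz,
                       const char *attribute, const char *value)
{
    int n;

    assert(out != NULL && attribute != NULL && value != NULL);
    n = snprintf(out, outsz, "%s{%lu}: %s",
                 attribute, (unsigned long)strlen(value), value);
    if (n < 0 || (size_t)n >= outsz)
        return -1;   /* output was truncated or an encoding error occurred */
    return n;
}
```

Calling demo_format_avline(line, sizeof line, "title", "Welcome to Siroe!") fills line with the title line from the example. Binary values would need an explicit size argument instead of strlen().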
A Search object has URL and schema-name fields to store its URL and schema_name:
/* The URL */
/* The Schema-Name, such as @document or @RDMHeader*/
A Search object contains a collection of SearchAVPair objects, each of which contains an attribute and one or more values. To access attribute values in a Search, use Search_Find() to retrieve the AVPair for the given attribute, or use Search_Findval() to retrieve the value string for a given attribute. You must use all-lowercase attribute names with the Search_Find*() macros, because only exact attribute name lookups are supported.
You can create Search objects by using the Search_Create() function. You can also read Search objects from a Search stream.
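Because lookups match attribute names exactly and expect lowercase, it can help to normalize a name before looking it up. The sketch below shows one way to do that; demo_lowercase_attribute is a hypothetical helper and is not provided by search.h.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of search.h: copies name into out in
 * lowercase, truncating if necessary, and always '\0'-terminates out. */
char *demo_lowercase_attribute(char *out, size_t outsz, const char *name)
{
    size_t i;

    assert(out != NULL && name != NULL && outsz > 0);
    for (i = 0; i + 1 < outsz && name[i] != '\0'; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[i] = '\0';
    return out;
}
```

The normalized name can then be passed as the attribute-name argument of a lookup, for example Search_Findval(s, demo_lowercase_attribute(buf, sizeof buf, "Title")).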
NSAPI_PUBLIC Search *Search_Create (char *schema_name, char *url)
Creates a Search structure with the given schema name and URL.
NSAPI_PUBLIC void Search_Free(Search *)
Frees the given Search structure.
NSAPI_PUBLIC int Search_GetTotalSize(Search *s)
Gets the estimated total size of the Search in bytes.
NSAPI_PUBLIC int Search_GetAttributeCount(Search *s)
Gets the number of attributes in the Search.
NSAPI_PUBLIC int Search_GetAttributeSize(Search *s)
Gets the size of the attributes only.
NSAPI_PUBLIC int Search_GetValueSize(Search *s)
Gets the size of the values only.
NSAPI_PUBLIC int Search_GetValueCount(Search *s)
Gets the number of values only.
NSAPI_PUBLIC int Search_Merge(Search *dst, Search *src);
Use this function to merge two Search objects (that is, to perform a union of their attribute-value pairs). It returns non-zero on error; otherwise, it returns zero and the 'dst' Search object contains all the attribute-value pairs from the 'src' Search object.
If the 'dst' object contains the same attribute as 'src', then the attribute becomes a multi-valued attribute and all of the values are copied over to 'dst'. Only multi-valued attributes are copied over. For single-value attributes, the value in 'dst' is discarded. Currently only “classification” is a multi-valued attribute.
#define Search_Find(search, attribute-name)
Retrieves the AVPair for the given attribute in the given search. For example, the following statement gets the AVPair for the title attribute in the search s:
SearchAVPair *avp = Search_Find(s, "title");
#define Search_Findval(search, attribute-name)
Retrieves the value string for the given attribute in the given search. For example, the following statement prints the value of the title attribute of the search s:
printf("Title = %s\n", Search_Findval(s, "title"));
#define Search_Remove(search, attribute-name)
Removes the given attribute from the given search.
#define Search_Insert(search, attribute-name, value, value-size)
Inserts the given attribute and the value of the given size as an AVPair into the search.
#define Search_InsertAVP(search, avpair)
Inserts the given AVPair into the given search.
#define Search_Apply(search, function, user-data)
Applies the given function with the given argument (user-data) to each AVPair in the given search. For example:
void print_av(Search *s, SearchAVPair *avp, void *unused)
{
    printf("%s = %s\n", avp->attribute, avp->value);
}

/* print every attribute and value in the search s */
Search_Apply(s, print_av, NULL);
Attribute-value pairs contain an attribute and an associated value. The value often is a simple null-terminated string; however, the value can also be binary data. Attribute-value pairs are stored as SearchAVPair structures.
The important fields in a SearchAVPair structure are:
Attribute string; '\0' terminated
Primary value; may be '\0' terminated
Number of bytes (8 bits) for primary value
Multiple values for multivalued attributes
The sizes for the values
Number of values associated with attribute
Last valid slot - array may contain holes
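The field list above can be pictured as a C structure. The sketch below is illustrative only: attribute and value are the only field names that appear in this chapter's examples, and every other field name, the DemoAVPair type, and the use of NULL to mark a hole are assumptions rather than the actual definition in search.h.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for SearchAVPair. Field names other than
 * attribute and value are assumptions made for this sketch. */
typedef struct {
    char   *attribute;  /* attribute string; '\0' terminated */
    char   *value;      /* primary value; may be '\0' terminated */
    size_t  vsize;      /* number of bytes in the primary value */
    char  **values;     /* multiple values for multivalued attributes */
    size_t *vsizes;     /* the sizes for the values */
    int     nvalues;    /* number of values associated with attribute */
    int     last_slot;  /* last valid slot; the array may contain holes */
} DemoAVPair;

/* Counts the occupied slots from 0 through last_slot.
 * Assumes a hole is represented by a NULL entry in values. */
int demo_count_valid(const DemoAVPair *avp)
{
    int i, count = 0;

    assert(avp != NULL);
    if (avp->values == NULL)
        return 0;
    for (i = 0; i <= avp->last_slot; i++)
        if (avp->values[i] != NULL)
            count++;
    return count;
}
```

The point of the sketch is that code walking the values array should iterate up to the last valid slot and skip holes, rather than assume the values are densely packed.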
A Search attribute can have multiple values. Search supports the convention of appending -NNN to the attribute name to indicate a multivalued attribute, for example, Title-1, Title-2, Title-3, and so on. The -NNN suffixes do not need to be sequential positive integers.
The Search Engine supports searching on multi-valued attributes such as the classification attribute. In the Search format, such an attribute is written as classification-1, classification-2, and so on. For example:
classification-1{5}: robot
classification-2{5}: siroe
classification-3{11}: web crawler
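The -NNN naming convention can be sketched as a small parser that splits a name such as classification-3 into its base name and number. The helper demo_parse_mv_name below is hypothetical; it is not the library's Search_MVAttributeParse() and makes no claim about how that function behaves.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of search.h: splits "name-NNN" into the
 * base name and the number. Returns 1 and fills base and index if the
 * name ends in '-' followed only by digits; returns 0 otherwise. */
int demo_parse_mv_name(const char *name, char *base, size_t basesz, int *index)
{
    const char *dash, *p;
    size_t len;

    assert(name != NULL && base != NULL && index != NULL);
    dash = strrchr(name, '-');
    if (dash == NULL || dash == name || dash[1] == '\0')
        return 0;                       /* no suffix at all */
    for (p = dash + 1; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;                   /* suffix is not a number */
    len = (size_t)(dash - name);
    if (len + 1 > basesz)
        return 0;                       /* base does not fit in the buffer */
    memcpy(base, name, len);
    base[len] = '\0';
    *index = atoi(dash + 1);
    return 1;
}
```

With this sketch, "classification-3" yields the base name "classification" and the number 3, while a name such as "content-type" is treated as single-valued because its suffix is not numeric.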
A SearchStream contains one or more Search objects.
In general, you use Search streams to create and process streams of many Search objects. Given a Search stream, you can parse it to get the Search objects from it. Use SearchStream_Parse() to get the next Search object in a Search stream, and use SearchStream_IsEOS() to check whether the last object has been parsed.
You can use filtering functions for a Search stream to specify that certain Search attributes are allowed or denied. If an attribute is allowed, you can parse and print that attribute for Search objects in the stream. If it is denied, you cannot parse or print that attribute of Search objects in the stream.
Search streams can be disk or memory based.
When you create a SearchStream, you need to specify if you will be printing or parsing the Search stream, and if you will be using a memory- or disk-based stream. The functions you need to use will depend on what you will be doing with the Search stream.
To create a Search stream into which you will print Search objects, use one of the following functions:
Search_PrintInitFile(): Creates a disk-based stream ready for printing.
Search_PrintInitStr(): Creates a memory-based stream ready for printing.
Search_PrintInitFn(): Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream.
To create a Search stream from a file or a string containing Search data, use one of the following functions:
Search_ParseInitFile(): Creates a disk-based stream ready for parsing. The stream is created from an input containing Search syntax.
Search_ParseInitStr(): Creates a memory-based stream ready for parsing. The stream is created from an input containing Search syntax.
SearchStream objects have a caller-data field, which you can use as you like:
void *caller_data; /* hook to be used by caller */
Use SearchStream_Parse() to get the Search objects from the Search stream, and use SearchStream_Print() to write Search objects to the Search stream.
When you have finished with the stream, close it by using SearchStream_Finish(). Use SearchStream_SetFinishFn() to register a finish_fn function that is called when SearchStream_Finish() has finished executing.
The following example code reads a Search stream from stdin and prints each Search in the stream to stdout. Notice that this code uses Search_ParseInitFile() to create the SearchStream that parses the input, and uses Search_PrintInitFile() to create the stream that prints the Searches to stdout.
SearchStream *searchin = Search_ParseInitFile(stdin);
SearchStream *searchout = Search_PrintInitFile(stdout);
Search *s;

/* parse each Search from stdin and print it to stdout */
while (!SearchStream_IsEOS(searchin)) {
    if ((s = SearchStream_Parse(searchin))) {
        SearchStream_Print(searchout, s);
        Search_Free(s);
    }
}
Search_PrintInitFile |
NSAPI_PUBLIC SearchStream *Search_PrintInitFile(FILE *file) Creates a disk-based stream ready for printing. |
Search_PrintInitStr |
NSAPI_PUBLIC SearchStream * Search_PrintInitStr(SearchBuffer *memory) Creates a memory-based stream ready for printing. |
Search_PrintInitFn |
NSAPI_PUBLIC SearchStream *Search_PrintInitFn(int (*write_fn)(void *data,char *buf, int bufsz), void *data) Creates a generic application-defined stream ready for printing. The given write_fn is used to print the stream. This function allows you to hook up your own routine for printing. |
Search_ParseInitFile |
NSAPI_PUBLIC SearchStream *Search_ParseInitFile(FILE *fp) Creates a disk-based stream ready for parsing. The file must contain Search-formatted data. The function reads Search data from the file object fp. |
Search_ParseInitStr |
NSAPI_PUBLIC SearchStream * Search_ParseInitStr(char *buf, int bufsz) Creates a memory-based stream ready for parsing. The character buffer must contain Search-formatted data. |
SearchStream_Finish |
NSAPI_PUBLIC int SearchStream_Finish (SearchStream *) Closes the stream when you have finished with it. |
SearchStream_SetFinishFn |
NSAPI_PUBLIC int SearchStream_SetFinishFn (SearchStream *, int (*finish_fn)(SearchStream *)) Allows you to hook up a function for cleaning up after the Search stream finishes its business. The finish_fn will be called when SearchStream_Finish() has finished executing. |
SearchStream_Print |
#define SearchStream_Print(ss, s) Prints another Search object to the Search stream ss. Returns 0 on success, or non-zero on error. |
SearchStream_Parse |
#define SearchStream_Parse(ss) Parses and returns the next Search object in the Search stream. |
SearchStream_IsEOS |
#define SearchStream_IsEOS(s) Returns 1 (true) if the Search stream has been exhausted. |
SearchStream_IsPrinting |
#define SearchStream_IsPrinting(s) Returns 1 (true) if the Search stream has been set up for printing by Search_PrintInitFile() or Search_PrintInitStr(). |
SearchStream_IsParsing |
#define SearchStream_IsParsing(s) Returns 1 (true) if the Search stream has been set up for parsing by Search_ParseInitFile() or Search_ParseInitStr(). |
To support targeted parsing and printing, you can use the attribute filtering mechanisms in the Search stream. For each Search stream object, you can associate a list of allowed attributes. When printing a Search stream, only the attributes that match the allowed attributes will be printed. When parsing a Search stream, only the attributes that match the allowed attributes will be parsed.
SearchStream_IsAllowed() and SearchStream_SetAllowed() allow attributes, while SearchStream_IsDenied() and SearchStream_SetDenied() deny attributes. You can allow or deny an attribute, but not both.
SearchStream_IsAllowed |
NSAPI_PUBLIC boolean_t SearchStream_IsAllowed (SearchStream *ss, char *attribute); Indicates whether the given attribute is allowed (that is, whether it can be printed or parsed). |
SearchStream_SetAllowed |
NSAPI_PUBLIC int SearchStream_SetAllowed (SearchStream *ss, char *allowed_attrs[]) Sets all the attributes in the allowed_attrs array to allowed. |
SearchStream_SetDenied |
NSAPI_PUBLIC int SearchStream_SetDenied (SearchStream *ss, char *denied_attrs[]); Sets all the attributes in the denied_attrs array to be denied (that is, they cannot be parsed or printed). |
SearchStream_GetAllowed |
NSAPI_PUBLIC char **SearchStream_GetAllowed (SearchStream *ss) Returns an array of all the attributes that are allowed. |
SearchStream_GetDenied |
NSAPI_PUBLIC char **SearchStream_GetDenied (SearchStream *ss); Returns an array of all the attributes that are denied. |
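The allow-list idea can be sketched independently of the library as a lookup in an array of attribute names. In the sketch below, demo_is_allowed is a hypothetical helper, and both the NULL terminator on the array and the exact, case-sensitive comparison are assumptions; the conventions used by SearchStream_SetAllowed() are not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of search.h: returns 1 if attribute is in
 * the allowed list, 0 otherwise.
 * Assumptions: the list is NULL-terminated and matching is exact. */
int demo_is_allowed(const char *const allowed[], const char *attribute)
{
    size_t i;

    assert(allowed != NULL && attribute != NULL);
    for (i = 0; allowed[i] != NULL; i++)
        if (strcmp(allowed[i], attribute) == 0)
            return 1;
    return 0;
}
```

A printing or parsing loop built on this idea would skip every attribute for which the check returns 0.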
You can use Search buffers in parsing or printing routines. They take care of memory allocation for inserting and appending. They are basically memory blocks that are easy for Search routines to use.
A Search buffer is represented by a SearchBuffer structure, which is created with the SearchBuffer_Create() function and freed with the SearchBuffer_Free() function. The SearchBuffer structure provides the append(), increase(), and reset() functions for manipulating the data in the buffer.
SearchBuffer_Create |
NSAPI_PUBLIC SearchBuffer * SearchBuffer_Create(int default_sz); Creates a SearchBuffer with the given default size. The SearchBuffer is used in Search_PrintInitStr(SearchBuffer *memory). Before you can print Search objects to memory, you need to create a buffer for the output. |
SearchBuffer_Free |
NSAPI_PUBLIC void SearchBuffer_Free (SearchBuffer *sb); Releases the memory buffer created by SearchBuffer_Create(). |
append |
void (*append)(SearchBuffer *sb, char *data, int n) Copies n bytes of data into the buffer. |
increase |
void (*increase)(SearchBuffer *sb, int add_n) Increases the size of the data buffer by add_n bytes. |
reset |
void (*reset)(SearchBuffer *sb) Resets the size of the data buffer and invalidates all currently valid data. A buffer can be reused by resetting it this way. |
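The append, increase, and reset operations describe a growable memory block. The following is a minimal sketch of that idea using a stand-in type; DemoBuffer, its field names, and the int return values are all assumptions, and the code is not the SearchBuffer implementation (whose member functions return void).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for a growable buffer; not the real SearchBuffer. */
typedef struct {
    char  *data;      /* allocated block */
    size_t used;      /* number of valid bytes */
    size_t capacity;  /* allocated size in bytes */
} DemoBuffer;

DemoBuffer *demo_buffer_create(size_t default_sz)
{
    DemoBuffer *b = malloc(sizeof *b);

    if (b == NULL)
        return NULL;
    b->capacity = default_sz > 0 ? default_sz : 1;
    b->used = 0;
    b->data = malloc(b->capacity);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    return b;
}

/* Grows the block by add_n bytes; returns 0 on success, -1 on failure. */
int demo_buffer_increase(DemoBuffer *b, size_t add_n)
{
    char *p;

    assert(b != NULL);
    p = realloc(b->data, b->capacity + add_n);
    if (p == NULL)
        return -1;
    b->data = p;
    b->capacity += add_n;
    return 0;
}

/* Copies n bytes of data to the end of the buffer, growing it if needed. */
int demo_buffer_append(DemoBuffer *b, const char *data, size_t n)
{
    assert(b != NULL && data != NULL);
    if (b->used + n > b->capacity &&
        demo_buffer_increase(b, b->used + n - b->capacity) != 0)
        return -1;
    memcpy(b->data + b->used, data, n);
    b->used += n;
    return 0;
}

/* Invalidates the current data but keeps the allocation for reuse. */
void demo_buffer_reset(DemoBuffer *b)
{
    assert(b != NULL);
    b->used = 0;
}

void demo_buffer_free(DemoBuffer *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

In this sketch, resetting only clears the count of valid bytes, so a buffer that has already grown can be reused without reallocating.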