com.plumtree.server
Interface IPTCustomFileAccessor


public interface IPTCustomFileAccessor


Method Summary
 void AttachToFile(com.plumtree.openfoundation.io.XPFile inputFile, com.plumtree.openfoundation.io.XPFile contentOutputFile)
          Called to give this accessor a reference to the file to be parsed, and, if the accessor can extract text content, to an output file for the text content.
 boolean CanExtractTextContent()
          Should return true if the accessor can generally extract the full text content of a document.
 boolean CanSuggestCardName()
          Should return true if the accessor can generally suggest a card name given the file's contents.
 boolean CanSummarizeDocument()
          Should return true if the accessor can generally make a short summary of a file's contents.
 IPTCustomFileAccessor Create()
          Generate a brand-new instance of this accessor class.
 java.lang.String GetAccessorDescription()
          Should return a human-readable description of the accessor, which might describe the types of files it can parse, the name of the accessor implementer, etc.
 java.lang.String GetAccessorGUID()
          Should return a unique GUID for this accessor class that distinguishes it from any other object in the system.
 java.lang.String GetAccessorIconGUID()
          Should return a unique GUID for this accessor's file type icon that distinguishes it from any other icon.
 java.lang.String GetAccessorName()
          Should return a human-readable name for the accessor.
 com.plumtree.openfoundation.util.IXPDictionary GetDocumentFields()
          Return all extracted metadata fields associated with the file (except for the suggested card name, summary, and full text content, which use the specialized methods shown above).
 java.lang.String GetDocumentSummary()
          Should return a summary of the content of the file being accessed.
 java.lang.String GetSuggestedCardName()
          Should return a suggested name for the card, based on the file being accessed.
 void WriteTextContent()
          Complete writing the extracted text content to the contentOutputFile specified in a preceding call to AttachToFile().
 

Method Detail

Create

IPTCustomFileAccessor Create()
Generate a brand-new instance of this accessor class. Usually this is a single line: return new MyAccessor(); Dynamic Discovery effectively requires this method: it returns a single instance of the class, from which all other instances must be generated. The portal UI works the same way.
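
The prototype-style factory described above might look like the following minimal sketch. AccessorPrototype is a hypothetical one-method stand-in used so the example compiles without the Plumtree libraries; a real accessor would implement IPTCustomFileAccessor and all of its methods.

```java
// Hypothetical stand-in for the Create() method of IPTCustomFileAccessor.
interface AccessorPrototype {
    AccessorPrototype Create();
}

// Hypothetical accessor class; the name MyAccessor comes from the text above.
class MyAccessor implements AccessorPrototype {
    // Per-file state lives in instance fields, so instances must not be shared.
    String attachedFileName;

    @Override
    public AccessorPrototype Create() {
        // Always hand back a fresh object, never "this".
        return new MyAccessor();
    }
}
```

Returning a new object each time keeps per-file state from leaking between the discovered instance and the instances that do the parsing.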


GetAccessorName

java.lang.String GetAccessorName()
Should return a human-readable name for the accessor. This name is never localized; the returned String is used in all locales.

Returns:
a non-null non-empty human-readable name for the accessor

GetAccessorDescription

java.lang.String GetAccessorDescription()
Should return a human-readable description of the accessor, which might describe the types of files it can parse, the name of the accessor implementer, etc. This string is never localized. Note: This is not currently used in the portal UI, but may be in the future.

Returns:
a human-readable description of the accessor

GetAccessorGUID

java.lang.String GetAccessorGUID()
Should return a unique GUID for this accessor class that distinguishes it from any other object in the system. GUIDs should be generated by an automated tool such as Microsoft's uuidgen, or obtained from the website www.guid.org by clicking "see your GUID". If necessary, add braces and hyphens and convert case so your GUID is of the form "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}", where each x is an uppercase hex digit.

Returns:
unique GUID for this accessor
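
As an illustration of the required form, the sketch below normalizes a GUID string into the braced, hyphenated, uppercase layout using only the standard java.util.UUID class. The class and method names (GuidFormat, toBracedUpper, normalize) are hypothetical helpers, not part of the Plumtree API.

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper for producing GUID strings in the form described above.
class GuidFormat {
    // Wraps a UUID in braces and uppercases its hex digits.
    static String toBracedUpper(UUID uuid) {
        return "{" + uuid.toString().toUpperCase(Locale.ROOT) + "}";
    }

    // Accepts 32 hex digits with or without braces and hyphens and
    // returns them as {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}.
    static String normalize(String raw) {
        String hex = raw.replaceAll("[{}\\-\\s]", "");
        if (!hex.matches("[0-9a-fA-F]{32}")) {
            throw new IllegalArgumentException("Not a 128-bit hex GUID: " + raw);
        }
        hex = hex.toUpperCase(Locale.ROOT);
        return "{" + hex.substring(0, 8) + "-" + hex.substring(8, 12) + "-"
                + hex.substring(12, 16) + "-" + hex.substring(16, 20) + "-"
                + hex.substring(20) + "}";
    }
}
```

A helper like this would be run once at development time; the resulting string is then hard-coded as the return value, since the GUID must stay the same across calls and restarts.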

GetAccessorIconGUID

java.lang.String GetAccessorIconGUID()
Should return a unique GUID for this accessor's file type icon that distinguishes it from any other icon. This should not be the same as the accessor GUID. See the comments for GetAccessorGUID() for instructions on how to generate the GUID. The icon's 20x20 GIF file should be placed under your imageserver installation, at plumtree/portal/public/img/sml{iconGUID}.gif.

Returns:
unique GUID for the accessor icon

CanSuggestCardName

boolean CanSuggestCardName()
Should return true if the accessor can generally suggest a card name given the file's contents. If this method returns true, GetSuggestedCardName() will later be called to retrieve a name to be used for the card. If it returns false, GetSuggestedCardName() will not be called and a less descriptive name will be generated through other means.

Returns:
whether the accessor can generally suggest a card name

CanSummarizeDocument

boolean CanSummarizeDocument()
Should return true if the accessor can generally make a short summary of a file's contents. If this method returns true, GetDocumentSummary() will later be called to retrieve a summary of each document's contents. If it returns false, GetDocumentSummary() will not be called.

Returns:
whether the accessor can generally summarize a document

CanExtractTextContent

boolean CanExtractTextContent()
Should return true if the accessor can generally extract the full text content of a document. If this method returns true, the portal will pass a non-null output file reference to AttachToFile() and will expect the file's text content to be written to that file. If it returns false, AttachToFile()'s contentOutputFile argument will be null.

Returns:
whether the accessor can extract text content

AttachToFile

void AttachToFile(com.plumtree.openfoundation.io.XPFile inputFile,
                  com.plumtree.openfoundation.io.XPFile contentOutputFile)
Called to give this accessor a reference to the file to be parsed, and, if the accessor can extract text content, to an output file for the text content. This method is called before GetSuggestedCardName, GetDocumentSummary, GetDocumentFields, or WriteTextContent.

The accessor may parse the file immediately when this method is called, or store the filename(s) and wait to parse until any of the later methods are called. The later methods may be called in any order, so the best choice of strategy depends on the file format. If the file has a short header containing metadata fields, followed by a long text content section, then parsing each section when the appropriate method (GetDocumentFields or WriteTextContent) is called is the most efficient choice. If, on the other hand, the metadata and content are mixed together throughout the file, then it is probably most efficient to just parse the file in AttachToFile(), write out the text content immediately, and cache the metadata in memory. This avoids parsing a long file with a lot of content twice.

Note that you should not read the entire text content into memory at once, since this may be a lot of text. It's fine to cache all metadata fields in memory at once. (Or, another way to say this is that the portal assumes it can keep all document properties except the full-text in memory at once, so if this will be a problem for your document you should reconsider your property strategy.)

Implementations should be prepared for this method to be called multiple times. In other words, the same accessor object may be used to parse more than one file, and the other methods below (GetSuggestedCardName, etc.) should reflect the most recent call to AttachToFile(). The input file specified in this call is guaranteed to still exist in any subsequent calls to GetSuggestedCardName, GetDocumentSummary, GetDocumentFields, or WriteTextContent (until another call to AttachToFile).

The output file may not exist yet when AttachToFile() is called; the accessor should be prepared to create it. Its directory should exist, though (accessors may throw an XPFileNotFoundException if this is not the case). The accessor is allowed to write to the output file any time between the call to AttachToFile() and when WriteTextContent() returns, but NOT afterwards. Once WriteTextContent() returns, Plumtree's portal code takes ownership of the output file and the accessor should not access or delete it.

It is strongly suggested that you write the output text content file in UTF-8, UTF-16, or UCS-4 encodings, with a byte order mark (BOM) if desired for UTF-16 or UCS-4. (If the BOM is omitted, the byte order will default to the machine endianness.) Some other encodings may be supported, depending on the text language. Contact Plumtree for more information on other currently supported encodings.

If AttachToFile() throws an exception, the portal will assume that the input file was incomprehensible, and will not call any other methods (except AttachToFile() on the next input file).

Parameters:
inputFile - The file to parse. Ordinarily, this file should exist, but implementations should throw an XPFileNotFoundException if for some reason it does not.
contentOutputFile - If CanExtractTextContent() returns true, this will be a non-null reference to a file where the accessor should write the text content of the file as plain text (not including any non-content metadata fields).
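
The "parse everything in AttachToFile()" strategy for files with interleaved metadata and content could be sketched roughly as follows. This is an illustration under stated assumptions, not the Plumtree implementation: java.nio.file.Path stands in for XPFile (whose API is not shown here), the "#key=value" metadata line format is invented, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: Path stands in for XPFile, and the "#key=value"
// metadata line format is invented for illustration.
class EagerAccessorSketch {
    // Metadata is small enough to cache in memory; text content is not.
    private final Map<String, String> fields = new HashMap<>();

    // Mirrors AttachToFile(inputFile, contentOutputFile). Assumes a non-null
    // output file, i.e. CanExtractTextContent() returns true.
    void attachToFile(Path inputFile, Path contentOutputFile) throws IOException {
        // Discard state left over from any previously attached file.
        fields.clear();

        // Opening the reader fails if the input is missing; opening the
        // writer creates the output file if it does not exist yet.
        try (BufferedReader in = Files.newBufferedReader(inputFile, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(contentOutputFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                int eq = line.indexOf('=');
                if (line.startsWith("#") && eq > 1) {
                    // Metadata line: cache it in memory.
                    fields.put(line.substring(1, eq).trim(), line.substring(eq + 1).trim());
                } else {
                    // Content line: stream it straight to the output file as UTF-8.
                    out.write(line);
                    out.newLine();
                }
            }
        }
    }

    // Returns the metadata cached by the most recent attachToFile() call.
    Map<String, String> getFields() {
        return fields;
    }
}
```

Because the file is read one line at a time, only the metadata is ever held in memory, and clearing the map at the top of the method keeps results tied to the most recently attached file.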

GetSuggestedCardName

java.lang.String GetSuggestedCardName()
Should return a suggested name for the card, based on the file being accessed. If CanSuggestCardName() returns false, this method should throw an exception (although the portal should not call it). This method may throw an exception if a card name could not be suggested, but other methods below may still be called. If the file is completely unparsable, then throw an exception in AttachToFile to prevent this method from being called, or throw separate exceptions in the other methods.


GetDocumentSummary

java.lang.String GetDocumentSummary()
Should return a summary of the content of the file being accessed. (Called by IPTAccessor.DocumentSummary().) If CanSummarizeDocument() returns false, this method should throw an exception (although the portal should not call it). This method may throw an exception if a summary could not be produced, but other Get... methods may still be called. If the file is completely unparsable, then throw an exception in AttachToFile to prevent this method from being called, or throw separate exceptions in the other methods.


GetDocumentFields

com.plumtree.openfoundation.util.IXPDictionary GetDocumentFields()
Return all extracted metadata fields associated with the file (except for the suggested card name, summary, and full text content, which use the specialized methods shown above). The keys of the returned IXPDictionary should be the non-null String names of metadata fields. The values' types depend on the types of the properties whose values will eventually be derived from the metadata field:

PT_PROPTYPE_STRING: any Object with a valid toString() method, but usually String is the best choice
PT_PROPTYPE_LONG: can be any integer type (although long will be cast to int), any floating-point type (also cast to int), or any String containing an int parsable by Integer.parseInt()
PT_PROPTYPE_DOUBLE: can be any floating-point type within 32-bit Float range, any integer type (castable to float), or any String containing a value parsable by Float.parseFloat()
PT_PROPTYPE_DATE: must be XPDateTime; no other type is allowed

This method is called from IPTAccessor.GetFields(). This method may throw an exception if the metadata fields could not be extracted, but other Get... methods may still be called. If the file is completely unparsable, then throw an exception in AttachToFile to prevent this method from being called, or throw separate exceptions in the other methods.

Returns:
IXPDictionary consisting of a mapping from field name (a String) to non-null field value of one of the types listed above (which may be String, Integer, Double, or XPDateTime).
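
To make the value rules for PT_PROPTYPE_LONG and PT_PROPTYPE_DOUBLE concrete, the sketch below mimics the conversions described above so that candidate values can be checked before they are returned. It is an approximation for illustration only: a java.util.Map stands in for IXPDictionary, the class and method names are hypothetical, the field names are invented, and the portal's actual conversion code may differ. XPDateTime values are left out because that class's API is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the documented value conversions; not portal code.
class FieldValueSketch {
    // PT_PROPTYPE_LONG: numbers are narrowed to int, Strings go through Integer.parseInt().
    static int asPortalLong(Object value) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            return Integer.parseInt((String) value);
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // PT_PROPTYPE_DOUBLE: numbers are converted to float, Strings go through Float.parseFloat().
    static float asPortalDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).floatValue();
        }
        if (value instanceof String) {
            return Float.parseFloat((String) value);
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // Example field map with non-null String keys and non-null values.
    static Map<String, Object> sampleFields() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("Author", "A. Writer");   // intended for a string property
        fields.put("PageCount", 12);         // intended for a long property
        fields.put("WordCount", "3400");     // a String parsable as int also works
        fields.put("Rating", 4.5);           // intended for a double property
        return fields;
    }
}
```

Note that a long outside the int range does not survive the narrowing, so values of that size are better returned as a String destined for a string property.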

WriteTextContent

void WriteTextContent()
Complete writing the extracted text content to the contentOutputFile specified in a preceding call to AttachToFile(). If the accessor has already written the content during a preceding call (in AttachToFile() or any call since then), this method may be a no-op.
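
For a file with a short header followed by a long body, the deferred strategy might be sketched as follows. As before, this is only an illustration: java.nio.file.Path stands in for XPFile, the "header, blank line, body" layout is an invented format, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: Path stands in for XPFile; the file layout is invented.
class DeferredContentSketch {
    private Path inputFile;
    private Path contentOutputFile;
    private boolean contentWritten;

    // Mirrors AttachToFile(): only remember the files and reset the flag.
    void attachToFile(Path inputFile, Path contentOutputFile) {
        this.inputFile = inputFile;
        this.contentOutputFile = contentOutputFile;
        this.contentWritten = false;
    }

    // Mirrors WriteTextContent(): does the work once, then becomes a no-op.
    void writeTextContent() throws IOException {
        if (contentWritten) {
            return; // the output file now belongs to the portal; do not touch it
        }
        try (BufferedReader in = Files.newBufferedReader(inputFile, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(contentOutputFile, StandardCharsets.UTF_8)) {
            boolean inBody = false;
            String line;
            while ((line = in.readLine()) != null) {
                if (inBody) {
                    // Stream body lines without buffering the whole text.
                    out.write(line);
                    out.newLine();
                } else if (line.isEmpty()) {
                    // First blank line ends the header.
                    inBody = true;
                }
            }
        }
        contentWritten = true;
    }
}
```

The flag is what makes a repeated call harmless: once the method has returned, the accessor never reopens the output file.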