com.plumtree.server
Interface IPTCustomFileAccessor
public interface IPTCustomFileAccessor
Method Summary
void AttachToFile(com.plumtree.openfoundation.io.XPFile inputFile,
com.plumtree.openfoundation.io.XPFile contentOutputFile)
- Called to give this accessor a reference to the file to be parsed,
and, if the accessor can extract text content, to an output file for
the text content.
boolean CanExtractTextContent()
- Should return true if the accessor can generally extract the full
text content of a document.
boolean CanSuggestCardName()
- Should return true if the accessor can generally suggest a card name
given the file's contents.
boolean CanSummarizeDocument()
- Should return true if the accessor can generally make a short summary
of a file's contents.
IPTCustomFileAccessor Create()
- Generate a brand-new instance of this accessor class.
java.lang.String GetAccessorDescription()
- Should return a human-readable description of the accessor, which
might describe the types of files it can parse, the name of the
accessor implementer, etc.
java.lang.String GetAccessorGUID()
- Should return a unique GUID for this accessor class that distinguishes
it from any other object in the system.
java.lang.String GetAccessorIconGUID()
- Should return a unique GUID for this accessor's file type icon that
distinguishes it from any other icon.
java.lang.String GetAccessorName()
- Should return a human-readable name for the accessor.
com.plumtree.openfoundation.util.IXPDictionary GetDocumentFields()
- Return all extracted metadata fields associated with the file (except for the
suggested card name, summary, and full text content, which use the specialized
methods shown above).
java.lang.String GetDocumentSummary()
- Should return a summary of the content of the file being accessed.
java.lang.String GetSuggestedCardName()
- Should return a suggested name for the card, based on the file being accessed.
void WriteTextContent()
- Complete writing the extracted text content to the contentOutputFile specified
in a preceding call to AttachToFile().
Create
IPTCustomFileAccessor Create()
- Generate a brand-new instance of this accessor class. Usually just one
line:
return new MyAccessor();
This method exists because Dynamic Discovery returns only a single
instance of the class, from which all other instances must be generated.
The portal UI works the same way.
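The prototype arrangement described above can be sketched as follows. The interface AccessorSketch is a hypothetical, cut-down stand-in for IPTCustomFileAccessor (only Create and GetAccessorName), introduced so that the example compiles without the Plumtree libraries. The names MyAccessor and CreateDemo and the returned name string are illustrative only.

```java
// Hypothetical, cut-down stand-in for IPTCustomFileAccessor so that the
// sketch compiles without the Plumtree libraries.
interface AccessorSketch {
    AccessorSketch Create();
    String GetAccessorName();
}

// Illustrative accessor; the class name follows the one-line example above.
class MyAccessor implements AccessorSketch {
    // Returns a fresh, independent instance on every call.
    public AccessorSketch Create() {
        return new MyAccessor();
    }

    public String GetAccessorName() {
        return "My Accessor";
    }
}

class CreateDemo {
    public static void main(String[] args) {
        // Assumed flow: discovery hands back one prototype instance, and
        // each working instance is then obtained from it through Create().
        AccessorSketch prototype = new MyAccessor();
        AccessorSketch worker = prototype.Create();
        System.out.println(worker != prototype);
        System.out.println(worker.GetAccessorName());
    }
}
```

Because each call returns a new object, any per-file state kept by one instance is not shared with the others.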
GetAccessorName
java.lang.String GetAccessorName()
- Should return a human-readable name for the accessor. This name
is never localized; the returned String is used in all locales.
- Returns:
- a non-null non-empty human-readable name for the accessor
GetAccessorDescription
java.lang.String GetAccessorDescription()
- Should return a human-readable description of the accessor, which
might describe the types of files it can parse, the name of the
accessor implementer, etc. This string is never localized.
Note: This is not currently used in the portal UI, but may be in
the future.
- Returns:
- a human-readable description of the accessor
GetAccessorGUID
java.lang.String GetAccessorGUID()
- Should return a unique GUID for this accessor class that distinguishes
it from any other object in the system. GUIDs should be generated
with an automated tool, such as Microsoft's uuidgen, or by visiting
the website www.guid.org and clicking "see your GUID". If necessary,
add braces and hyphens and convert case so your GUID is of the form
"{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}", where each x is a hex digit
(uppercase).
- Returns:
- unique GUID for this accessor
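As one more way to obtain a value in the required form, the standard java.util.UUID class can generate a random UUID, which then only needs braces and uppercase hex digits. This is a minimal sketch; the class name GuidFormat and the method toBracedUpper are hypothetical helpers and not part of the Plumtree API.

```java
import java.util.Locale;
import java.util.UUID;

class GuidFormat {
    // Wraps a UUID in braces and upper-cases its hex digits, giving the
    // "{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}" form described above.
    static String toBracedUpper(UUID uuid) {
        return "{" + uuid.toString().toUpperCase(Locale.ROOT) + "}";
    }

    public static void main(String[] args) {
        // Run once, then paste the printed value into the accessor as a constant.
        System.out.println(toBracedUpper(UUID.randomUUID()));
    }
}
```

The GUID identifies the accessor class, so GetAccessorGUID() should return the same hard-coded string on every call rather than generating a new value each time.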
GetAccessorIconGUID
java.lang.String GetAccessorIconGUID()
- Should return a unique GUID for this accessor's file type icon that
distinguishes it from any other icon. This should not be the same as
the accessor GUID.
See the comments for GetAccessorGUID() for instructions on how to
generate the GUID. The icon's 20x20 GIF file should be placed under your
imageserver installation, at plumtree/portal/public/img/sml{iconGUID}.gif.
- Returns:
- unique GUID for the accessor icon
CanSuggestCardName
boolean CanSuggestCardName()
- Should return true if the accessor can generally suggest a card name
given the file's contents.
If this method returns true, GetSuggestedCardName() will later be
called to retrieve a name to be used for the card. If it returns false,
GetSuggestedCardName() will not be called and a less descriptive name
will be generated through other means.
- Returns:
- whether the accessor can generally suggest a card name
CanSummarizeDocument
boolean CanSummarizeDocument()
- Should return true if the accessor can generally make a short summary
of a file's contents.
If this method returns true, GetDocumentSummary() will later be
called to retrieve a summary of each document's contents. If it
returns false, GetDocumentSummary() will not be called.
- Returns:
- whether the accessor can generally summarize a document
CanExtractTextContent
boolean CanExtractTextContent()
- Should return true if the accessor can generally extract the full
text content of a document.
If this method returns true, the portal will pass a non-null output
file reference to AttachToFile() and will expect the file's text
content to be written to that file. If it returns false, AttachToFile()'s
contentOutputFile argument will be null.
- Returns:
- whether the accessor can extract text content
AttachToFile
void AttachToFile(com.plumtree.openfoundation.io.XPFile inputFile,
com.plumtree.openfoundation.io.XPFile contentOutputFile)
- Called to give this accessor a reference to the file to be parsed,
and, if the accessor can extract text content, to an output file for
the text content. This method is called before GetSuggestedCardName,
GetDocumentSummary, GetDocumentFields, or WriteTextContent.
The accessor may parse the file immediately when this method is called,
or store the filename(s) and wait to parse until any of the later
methods are called. The later methods may be called in any order, so
the best choice of strategy depends on the file format. If the file has
a short header containing metadata fields, followed by a long text content section,
then parsing each section when the appropriate method (GetDocumentFields
or WriteTextContent) is called is the most efficient choice. If, on the
other hand, the metadata and content are mixed together throughout the file,
then it is probably most efficient to just parse the file in AttachToFile(),
write out the text content immediately, and cache the metadata in memory.
This avoids parsing a long file with a lot of content twice.
Note that you should not read the entire text content into memory at once,
since this may be a lot of text. It's fine to cache all metadata fields
in memory at once. (Put another way, the portal assumes that it can keep
all document properties except the full text in memory at once, so if this
is a problem for your documents, you should reconsider your property
strategy.)
Implementations should be prepared for this method to be called multiple
times. In other words, the same accessor object may be used to parse
more than one file, and the other methods below (GetSuggestedCardName, etc.)
should reflect the most recent call to AttachToFile().
The input file specified in this call is guaranteed to still exist
in any subsequent calls to GetSuggestedCardName, GetDocumentSummary,
GetDocumentFields, or WriteTextContent (until another call to AttachToFile).
The output file may not exist yet when AttachToFile() is called; the accessor
should be prepared to create it. Its directory should exist, though
(accessors may throw an XPFileNotFoundException if this is not the case).
The accessor is allowed to write to the output file any time between the
call to AttachToFile() and when WriteTextContent() returns, but NOT
afterwards. Once WriteTextContent() returns, Plumtree's portal code
takes ownership of the output file and the accessor should not access or
delete it.
It is strongly suggested that you write the output text content file in
UTF-8, UTF-16, or UCS-4 encodings, with a byte order mark (BOM) if desired
for UTF-16 or UCS-4. (If the BOM is omitted, the byte order will default
to the machine endianness.) Some other encodings may be supported,
depending on the text language. Contact Plumtree for more information on
other currently supported encodings.
If AttachToFile() throws an exception, the portal will assume that the
input file was incomprehensible, and will not call any other methods
(except AttachToFile() on the next input file).
- Parameters:
inputFile
- The file to parse. Ordinarily, this file should exist,
but implementations should throw an XPFileNotFoundException if for some
reason it does not.
contentOutputFile
- If CanExtractTextContent() returns true, this will be
a non-null reference to a file where the accessor should write the text
content of the file as plain text (not including any non-content metadata
fields).
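The advice about not holding the whole text content in memory, and about writing the output as UTF-8, can be illustrated with a streaming copy. The sketch below makes several assumptions: the input is a text file with a short metadata header, then a blank line, then the body; java.nio.file.Path stands in for XPFile; and StreamingExtract and writeBody are hypothetical names. A real accessor would use the XPFile references it receives.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamingExtract {
    // Copies the body of a "header, blank line, body" text file to the
    // output file one line at a time, so only a single line is held in
    // memory. The output file is created if it does not exist yet.
    static void writeBody(Path input, Path output) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            boolean inBody = false;
            while ((line = in.readLine()) != null) {
                if (!inBody) {
                    // The first blank line ends the assumed metadata header.
                    inBody = line.isEmpty();
                    continue;
                }
                out.write(line);
                out.write('\n');
            }
        }
    }
}
```

For a format like this, the header could be parsed separately when GetDocumentFields() is called, which matches the "parse each section on demand" strategy described above.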
GetSuggestedCardName
java.lang.String GetSuggestedCardName()
- Should return a suggested name for the card, based on the file being accessed.
If CanSuggestCardName() returns false, this method should throw an exception
(although the portal should not call it).
This method may throw an exception if a card name could not be suggested,
but other methods below may still be called. If the file is completely
unparsable, then throw an exception in AttachToFile to prevent this
method from being called, or throw separate exceptions in the other
methods.
GetDocumentSummary
java.lang.String GetDocumentSummary()
- Should return a summary of the content of the file being accessed.
(Called by IPTAccessor.DocumentSummary().)
If CanSummarizeDocument() returns false, this method should throw an
exception (although the portal should not call it).
This method may throw an exception if a summary could not be suggested,
but other Get.. methods may still be called. If the file is completely
unparsable, then throw an exception in AttachToFile to prevent this
method from being called, or throw separate exceptions in the other
methods.
GetDocumentFields
com.plumtree.openfoundation.util.IXPDictionary GetDocumentFields()
- Return all extracted metadata fields associated with the file (except for the
suggested card name, summary, and full text content, which use the specialized
methods shown above). The keys of the returned IXPDictionary should be the
non-null String names of metadata fields. The values' types depend on the
types of the properties whose values will eventually be derived from the
metadata field:
PT_PROPTYPE_STRING: any Object with a valid toString() method, but usually
String is the best choice
PT_PROPTYPE_LONG: can be any Integer type (although long will be cast to int),
any floating-point type (also cast to int), or any String
containing an int parsable by Integer.parseInt()
PT_PROPTYPE_DOUBLE: can be any floating-point type within 32-bit Float range,
any integer type (castable to float), or any String
containing a value parsable by Float.parseFloat()
PT_PROPTYPE_DATE: must be XPDateTime, no other type allowed.
This method is called from IPTAccessor.GetFields().
This method may throw an exception if the metadata fields could not be extracted,
but other Get.. methods may still be called. If the file is completely
unparsable, then throw an exception in AttachToFile to prevent this
method from being called, or throw separate exceptions in the other
methods.
- Returns:
- IXPDictionary consisting of a mapping from field name (a String) to
non-null field value of one of the types listed above (which may be
String, Integer, Double, or XPDateTime).
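The note that long and floating-point values are cast to int for PT_PROPTYPE_LONG has practical consequences, which the sketch below illustrates with ordinary Java casts. It is not the portal's conversion code; FieldCoercion and toPortalLong are hypothetical names, and the sketch only shows how such narrowing behaves in Java for values an accessor might return.

```java
class FieldCoercion {
    // Mimics the narrowing described for PT_PROPTYPE_LONG with plain casts.
    static int toPortalLong(Object value) {
        if (value instanceof Long) {
            // Narrowing long to int discards the high 32 bits.
            return (int) ((Long) value).longValue();
        }
        if (value instanceof Float || value instanceof Double) {
            // Narrowing a floating-point value drops the fraction.
            return (int) ((Number) value).doubleValue();
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        // Any other object is parsed as text; bad input throws NumberFormatException.
        return Integer.parseInt(value.toString());
    }

    public static void main(String[] args) {
        System.out.println(toPortalLong(42));
        System.out.println(toPortalLong(3.9));
        System.out.println(toPortalLong("17"));
        System.out.println(toPortalLong(5_000_000_000L));
    }
}
```

Since a long outside the int range silently changes value under this kind of cast, it is safer for an accessor to return values that already fit in an int.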
WriteTextContent
void WriteTextContent()
- Complete writing the extracted text content to the contentOutputFile specified
in a preceding call to AttachToFile(). If the accessor has already written the
content during a preceding call (in AttachToFile() or any call since then), this
method may be a no-op.
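The interplay between AttachToFile() and WriteTextContent() can be sketched as a small piece of per-file state. In this minimal example, java.nio.file.Path stands in for XPFile, and the class LifecycleSketch with its methods attach, writeTextContent, and isContentWritten is hypothetical. The "extraction" is a plain chunked copy of a UTF-8 text file, chosen only to keep the sketch short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LifecycleSketch {
    private Path input;
    private Path output;            // null when no text content is requested
    private boolean contentWritten;

    // Stand-in for AttachToFile(): remember the files and reset per-file
    // state, since the same object may be attached to several files in turn.
    void attach(Path inputFile, Path contentOutputFile) {
        input = inputFile;
        output = contentOutputFile;
        contentWritten = false;
    }

    // Stand-in for WriteTextContent(): a no-op if the content was already
    // written since the last attach, or if no output file was supplied.
    void writeTextContent() throws IOException {
        if (contentWritten || output == null) {
            return;
        }
        try (Reader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             Writer out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];   // copy in chunks, not all at once
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        contentWritten = true;
    }

    boolean isContentWritten() {
        return contentWritten;
    }
}
```

After writeTextContent() returns, the sketch never touches the output file again, which matches the ownership rule stated for AttachToFile().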