Called to give this accessor a reference to the file to be parsed and, if the accessor can extract text content, to an output file for that content. This method is called before GetSuggestedCardName, GetDocumentSummary, GetDocumentFields, or WriteTextContent.
The accessor may parse the file immediately when this method is called, or store the filename(s) and defer parsing until one of the later methods is called. The later methods may be called in any order, so the best strategy depends on the file format. If the file has a short header of metadata fields followed by a long text content section, then parsing each section when the corresponding method (GetDocumentFields or WriteTextContent) is called is the most efficient choice. If, on the other hand, the metadata and content are mixed together throughout the file, then it is probably most efficient to parse the file in AttachToFile(), write out the text content immediately, and cache the metadata in memory. This avoids parsing a long file twice.
Do not read the entire text content into memory at once, since it may be very large. Caching all metadata fields in memory is fine. (Put another way, the portal assumes it can hold all document properties except the full text in memory at once, so if that is a problem for your documents you should reconsider your property strategy.)
Implementations should be prepared for this method to be called multiple times. In other words, the same accessor object may be used to parse more than one file, and the other methods below (GetSuggestedCardName, etc.) should reflect the most recent call to AttachToFile(). The input file specified in this call is guaranteed to still exist during any subsequent calls to GetSuggestedCardName, GetDocumentSummary, GetDocumentFields, or WriteTextContent (until the next call to AttachToFile()).
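The single-pass strategy described above, for formats that mix metadata and content, could be sketched roughly as follows. This is an illustration only: the "@key=value" file format is invented, the class is a stand-in that does not implement the real IPTCustomFileAccessor interface, and the parameter and return types shown are assumptions, not the actual Plumtree signatures.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical accessor for an invented format in which "@key=value" metadata
// lines are interleaved with ordinary text lines. Method names mirror the
// documentation; parameter and return types are assumptions.
class InterleavedFormatAccessor {
    // Metadata is small, so it is cached in memory.
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Single pass over the input: text lines are streamed straight to the
    // output file, metadata lines are cached. The text is never held in
    // memory as a whole.
    void AttachToFile(Path input, Path textOutput) throws IOException {
        fields.clear(); // the accessor may be reused, so drop the previous file's state
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(textOutput, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                int eq = line.indexOf('=');
                if (line.startsWith("@") && eq > 1) {
                    fields.put(line.substring(1, eq), line.substring(eq + 1));
                } else {
                    out.write(line);
                    out.newLine();
                }
            }
        }
    }

    // Returns a copy of the metadata cached by the most recent AttachToFile().
    Map<String, String> GetDocumentFields() {
        return new LinkedHashMap<>(fields);
    }

    // Nothing left to do: the text was already written in AttachToFile().
    void WriteTextContent() {
    }
}
```

Clearing the cached fields at the start of each AttachToFile() call is what keeps the later methods in step with the most recently attached file when the same accessor object is reused.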
The output file may not exist yet when AttachToFile() is called; the accessor should be prepared to create it. Its directory should exist, however, and accessors may throw an XPFileNotFoundException if it does not. The accessor may write to the output file at any time between the call to AttachToFile() and the return of WriteTextContent(), but NOT afterwards. Once WriteTextContent() returns, Plumtree's portal code takes ownership of the output file, and the accessor should not access or delete it.
It is strongly suggested that you write the output text content file in UTF-8, UTF-16, or UCS-4, optionally with a byte order mark (BOM) for UTF-16 or UCS-4. (If the BOM is omitted, the byte order defaults to the machine endianness.) Some other encodings may be supported, depending on the text language. Contact Plumtree for more information on other currently supported encodings.
If AttachToFile() throws an exception, the portal will assume that the input file was incomprehensible and will not call any other methods (except AttachToFile() on the next input file).
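As a rough sketch of the output-file rules above, the helper below checks that the output directory exists, creates the file if needed, and streams the text to it as UTF-8 in fixed-size chunks. The class and method names are hypothetical, and java.io.FileNotFoundException is used only as a stand-in for XPFileNotFoundException, whose constructor is not shown in this document.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of the Plumtree API.
class TextContentWriter {
    // Streams characters from source to the output file, encoded as UTF-8.
    // Java's UTF-8 encoder does not emit a byte order mark.
    static void writeUtf8(Reader source, Path output) throws IOException {
        Path dir = output.toAbsolutePath().getParent();
        if (dir == null || !Files.isDirectory(dir)) {
            // Stand-in for XPFileNotFoundException (assumption).
            throw new FileNotFoundException("Output directory does not exist: " + dir);
        }
        // newBufferedWriter creates the file if it is missing and truncates it otherwise.
        try (Writer out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192]; // bounded buffer: the text is never fully in memory
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```

An accessor would call such a helper from AttachToFile() or WriteTextContent(), and would not touch the output file again once WriteTextContent() has returned.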
IPTCustomFileAccessor Interface | com.plumtree.server Namespace