PDF Page Level Controls
This feature lets you add internal metadata indexes to the PDF output that Documaker creates. The purpose of these indexes is left to the customer to decide. Adding this information does not affect how the PDF is displayed to an end user. The data may be used by storage systems, print vendors, or any other process that can scan the information out of the PDF.
Per the PDF document architecture specification, these internal indexes must be created within the Page-Piece Dictionary (PPD). A PPD can be used to hold private application data. Similar to TLE or NoOp records, the PPDs are not part of the viewable page, but exist within the document itself, and can be extracted for use by downstream processes. Each page of a PDF may contain a page-piece dictionary, but does not have to.
Applications can use this dictionary as a place to store private data in connection with that document, page, or form. Such private data can convey information meaningful to the application that produces it but is typically ignored by general-purpose PDF viewer applications.
The general structure of the PPD follows a nested hierarchy. In the example below, 'IBM-ODIndexes' and 'Private' are dictionaries under the 'PieceInfo' Page-Piece Dictionary. The 'IBM-ODIndexes' dictionary is an example of a custom dictionary name within the PPD and is specified via an INI setting. The PDF architecture requires the 'Private' dictionary within the custom dictionary; it contains the individual custom metadata elements (DistributionMethod, RecipientName, etc.). The LastModified element is a date stamp required by the PDF architecture and is generated automatically by the PDF driver. A PPD within a PDF file might appear as follows:
- /PieceInfo <<
-   /IBM-ODIndexes <<
-     /Private <<
-       /DistributionMethod(Print)
-       /RecipientName(Joe Doe)
-       /NoticeType(W)
-       /StmtDate(20120507)
-     >>
-     /LastModified(D:20200514000000Z)
-   >>
- >>
- /LastModified(D:20200514000000Z)
Adding a PieceInfo dictionary to one or more pages within a PDF file requires three new INI settings and a corresponding DAL script.
The new INI settings belong to the PDF printer group (e.g. PrtType:PDF) and activate the PPD logic by defining the dictionary name, the script, and the data separator character.
For example,
- < PrtType:PDF >
- PieceInfoDictionary = YourDictionaryName
- PieceInfoScript = YourScriptName.DAL
- PieceInfoSeparator = :
PieceInfoDictionary= is a new INI setting in the PDF printer group (e.g. PrtType:PDF) used to specify the custom dictionary name within the 'PieceInfo' PPD. In the earlier example where IBM-ODIndexes is the dictionary, this setting would be:
- PieceInfoDictionary= IBM-ODIndexes
Documaker recommends using 7-bit ASCII characters (preferably only letters and numbers) for dictionary names. Note that the example above also uses a hyphen. Check the requirements of any vendor or product that expects to locate the dictionary and follow their recommendation.
PieceInfoScript= is a new INI option used to define the DAL script that uses the AddComment (or the new AddUTF8Comment) DAL function to add one or more strings that will be used in the PPD Private dictionary entries for that page.
PieceInfoSeparator= is a new INI setting used to specify the character that separates the key name and value portions of the comment string. Documaker recommends using punctuation characters (colon (:), semicolon (;), comma (,), etc.) for the separator character. Don't use a character that would be expected to appear in your data.
The AddComment (DAL) functionality is used to add dynamic “comment” content into certain print output, which can later be interpreted by subsequent systems (like an archiver) to further process the output.
The PDF driver expects each comment string to contain a delimiter character that separates the dictionary entry key name from its value, similar to the TLEScript functionality for AFP.
The second parameter of AddComment determines how the first parameter string will be converted.
- -1 - convert the string to UTF-8 (new value for release 12.7.1)
- 0 - (zero) convert the string to EBCDIC (default value if no second parameter is used)
- 1 - convert the string to ASCII (Obsolete, applied to mainframe platforms)
- 2 - do not convert the string
The default is zero (0) because AddComment was originally added for AFP which normally uses EBCDIC.
Note: IBM's OnDemand requires UTF-8 data in the PDF Page-Piece (PieceInfo) dictionary. If the target for the PPD information is OnDemand, use -1 as the second parameter in your AddComment functions to add your metadata entries.
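To illustrate why the conversion parameter matters, the following Python sketch encodes the same comment text as UTF-8 and as EBCDIC and shows that the resulting bytes differ. It is not Documaker code. The `cp037` codec is only a stand-in for "EBCDIC"; the code page Documaker actually uses is not stated here and is an assumption.

```python
# Illustration only: the same text yields different bytes per encoding.
# cp037 is one common EBCDIC code page (assumed, not confirmed for Documaker).
value = "RecipientName:Joe Doe"

utf8_bytes = value.encode("utf-8")     # comparable to a second parameter of -1
ebcdic_bytes = value.encode("cp037")   # comparable to the default of 0

print(utf8_bytes.hex(" "))
print(ebcdic_bytes.hex(" "))
print("same bytes:", utf8_bytes == ebcdic_bytes)
```

A consumer that expects UTF-8 would not be able to read the EBCDIC bytes as text, which is why the default value of zero is unsuitable for such targets.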
AddUTF8Comment is a new DAL function that assumes the string parameter should be converted to UTF-8 if needed.
Both the AddComment and AddUTF8Comment DAL functions add strings to the same comment list; they differ only in that AddUTF8Comment always targets UTF-8, while AddComment uses its second parameter to select the conversion.
Assuming the INI setting "PieceInfoSeparator = :", the functions may be called in the DAL script as follows:
- AddComment("RecipientName:Joe Doe", -1)
- or
- AddUTF8Comment("RecipientName:Joe Doe")
Either call adds a dictionary entry with the key "RecipientName" and the value "Joe Doe" using UTF-8 encoding.
The DAL script can call AddComment multiple times to add multiple entries into the PieceInfo dictionary for a given page. To skip adding a PieceInfo dictionary for a given page, the script should avoid calling AddComment for that page. The DAL script can use various DAL functions, GVM variables, etc. to access MRL information in order to gather the data used for the AddComment additions.
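The way a comment string maps to a key name and a value can be sketched in Python as follows. This is an illustration, not the PDF driver's implementation: the function name `split_comment` is hypothetical, and splitting at the first occurrence of the separator is an assumption, since the exact behavior when the separator appears more than once is not described here.

```python
def split_comment(comment: str, separator: str = ":") -> tuple[str, str]:
    """Split a comment string into (key, value) at the first separator.

    Assumption: only the first separator is significant; anything after
    it, including further separators, is treated as part of the value.
    """
    key, found, value = comment.partition(separator)
    if not found:
        raise ValueError(f"separator {separator!r} not found in {comment!r}")
    return key, value


comments = [
    "DistributionMethod:Print",
    "RecipientName:Joe Doe",
    "StmtDate:20120507",
]
entries = [split_comment(c) for c in comments]
for key, value in entries:
    print(f"/{key}({value})")
```

The sketch also shows why the separator should not be a character that can occur in the key name: the key would be cut short at the first occurrence.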
LIMITATIONS / RECOMMENDATIONS
- Neither the custom dictionary name nor the key name portion of the comment string (i.e. the text before the delimiter character) may contain any of the following characters: left parenthesis '(', right parenthesis ')', forward slash '/', less-than sign '<', greater-than sign '>'.
- The maximum dictionary name and key name lengths are 127 bytes.
If the dictionary name violates either of these rules, the PieceInfo dictionary will not be created.
If the key name portion of a comment violates these rules, then the key name/value string from the comment will not be added to the PieceInfo dictionary.
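The two rules above can be checked before output is generated. The following Python sketch is one way to do so, under stated assumptions: the function name `is_valid_name` is hypothetical, the length is measured in UTF-8 bytes, and an empty name is rejected. None of these details are confirmed by Documaker.

```python
FORBIDDEN_CHARS = set("()/<>")   # characters listed in the rule above
MAX_NAME_BYTES = 127             # maximum dictionary/key name length


def is_valid_name(name: str) -> bool:
    """Return True if a dictionary or key name satisfies both rules."""
    size = len(name.encode("utf-8"))  # assumption: limit counts UTF-8 bytes
    if size == 0 or size > MAX_NAME_BYTES:
        return False
    return not (FORBIDDEN_CHARS & set(name))


for candidate in ["IBM-ODIndexes", "Recipient/Name", "Stmt(Date)"]:
    print(candidate, "->", is_valid_name(candidate))
```

Running such a check in a test harness can help catch names that would otherwise cause the dictionary or an entry to be silently omitted.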
Documaker recommends using 7-bit ASCII characters (preferably only letters and numbers) for both the dictionary name and the key name portion of the comment strings. Documaker recommends using punctuation characters (colon (:), semicolon (;), comma (,), etc.) for the separator character.
Documaker advises using unique key names but will not prevent using the same key name multiple times in a PieceInfo dictionary.
If the value portion of the comment string contains binary data, it should be encoded within the comment string using a 2-digit hexadecimal code in the format of #hh.
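A minimal Python sketch of the #hh encoding is shown below. Only the #hh format itself comes from the text above. Which bytes need to be encoded is an assumption: here, every byte outside printable ASCII is encoded, and the '#' character itself is also encoded to keep the result unambiguous. The function name `encode_value` and the use of uppercase hex digits are likewise illustrative choices.

```python
def encode_value(raw: bytes) -> str:
    """Encode bytes for use as a comment value, using #hh for binary data.

    Assumption: printable ASCII (0x20-0x7E) except '#' passes through
    unchanged; every other byte becomes '#' plus two hex digits.
    """
    parts = []
    for byte in raw:
        if 0x20 <= byte <= 0x7E and byte != ord("#"):
            parts.append(chr(byte))
        else:
            parts.append(f"#{byte:02X}")
    return "".join(parts)


print(encode_value(b"ID\x00\x1fEND"))  # ID#00#1FEND
```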
The dictionary name and key names may be case sensitive.
If multiple pages have a PDF PieceInfo dictionary, each dictionary may be required to have the same number of entries.
Third parties (PDF readers, print vendors, archive systems, etc.) may impose other limitations on the data used in PDF PieceInfo dictionaries.
Verifying the PieceInfo dictionaries in a PDF is difficult without tools capable of displaying the information. The PPD is an internal PDF structure and is not visible in most standard PDF viewers. While it may be possible to do a binary scan and find /PieceInfo entries, it would be very difficult to determine which page each dictionary belongs to without appropriate tools. Therefore, you may have to acquire additional tools or work with your vendor to verify that the dictionaries were generated correctly.
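The binary scan mentioned above can be sketched in Python as follows. This is a rough check only: it counts occurrences of the /PieceInfo name in the raw bytes, it cannot tell which page a dictionary belongs to, and it will miss entries stored inside compressed object streams. The function name `count_pieceinfo` and the file name in the usage comment are hypothetical.

```python
import re

# Match the PDF name /PieceInfo, but not longer names that merely start with it.
PIECEINFO_PATTERN = re.compile(rb"/PieceInfo\b")


def count_pieceinfo(pdf_bytes: bytes) -> int:
    """Count /PieceInfo names found in the raw bytes of a PDF file."""
    return len(PIECEINFO_PATTERN.findall(pdf_bytes))


# Usage with a real file (hypothetical file name):
#     with open("output.pdf", "rb") as handle:
#         print(count_pieceinfo(handle.read()))

sample = (
    b"1 0 obj << /Type /Page /PieceInfo << /IBM-ODIndexes << >> >> >> endobj\n"
    b"2 0 obj << /Type /Page >> endobj\n"
)
print(count_pieceinfo(sample))  # 1
```

A count that matches the number of pages expected to carry metadata is a useful sanity check, but it does not replace inspection with a tool that understands the PDF object structure.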