The unmarshal and marshal methods of a COBOL Copybook OTD (with the exception of marshalToString and unmarshalFromString) have been reimplemented to heed the OTD structure’s data type information. When data flows into or out of the OTD, character set encoding is applied only to the portions of the data that fall on or draw from OTD fields corresponding to items in the Copybook specification that store character data (i.e., usage display items, whether implicitly or explicitly specified). Data for other types of OTD fields is not subject to charset encoding, since these fields are capable of containing binary (non-character) data.
An ambiguity arises when an OTD field that corresponds to a usage display item is also the object of one or more redefinitions in the Copybook. Redefined items may have multiple, alternate storage types, and to deal with such an item, the OTD would have to decide which of the multiple definitions is in effect at the time of unmarshaling or marshaling, in relation to the available data. The current implementation of COBOL Copybook OTDs resolves this ambiguity by ignoring redefinitions. The decision whether or not to apply encoding to a field is based solely on the item’s original storage specification in the Copybook.
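The selective encoding described above can be illustrated with a minimal sketch. It is not the OTD's actual implementation: the class SelectiveEncodingSketch, the Field type, and the marshal signature are hypothetical names introduced here, and the charsets are chosen only because every Java runtime provides them (a real deployment would more likely involve an EBCDIC code page). The isDisplay flag is assumed to be fixed once from the item's original storage specification, so redefinitions play no part in the decision.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; these are not the OTD's real classes. */
final class SelectiveEncodingSketch {

    /** One field of the record, described by its original Copybook definition. */
    static final class Field {
        final String name;
        final boolean isDisplay; // true only for usage display items (assumed flag)
        final byte[] data;

        Field(String name, boolean isDisplay, byte[] data) {
            this.name = name;
            this.isDisplay = isDisplay;
            this.data = data;
        }
    }

    /**
     * Transcodes the bytes of display fields from one charset to another
     * and copies the bytes of every other field unchanged.
     */
    static byte[] marshal(List<Field> fields, Charset from, Charset to) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Field f : fields) {
            byte[] bytes = f.isDisplay
                    ? new String(f.data, from).getBytes(to) // character data: transcode
                    : f.data;                               // binary data: pass through
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<Field> record = Arrays.asList(
                new Field("NAME", true, new byte[] {0x41, 0x42}),    // "AB" as display data
                new Field("COUNT", false, new byte[] {0x00, 0x41})); // binary item
        byte[] result = marshal(record,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_16BE);
        System.out.println(Arrays.toString(result));
    }
}
```

In this sketch the display bytes 41 42 are re-encoded as 00 41 00 42, while the binary bytes 00 41 pass through untouched even though they would also be valid character data.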
COBOL Copybook OTDs do not support any particular Double Byte Character Set (DBCS) encoding. When data is inserted into DBCS nodes, the OTD does not inspect it to determine which specific DBCS encoding its character codes or byte sequences use (e.g., to discern between a double-byte and a multi-byte encoding). As a consequence:
DBCS items are represented in the OTD by Java byte array nodes, and their content is treated as binary "blobs" according to the following rules:
If content is set directly to a DBCS node, it is stored as-is.
If content is retrieved directly from a DBCS node, the content that was originally set is returned as-is.
If content is unmarshaled via the OTD root, the portion corresponding to the DBCS node is stored as-is. Note, however, that the correctness of the aggregate input is the responsibility of the caller making the root-level unmarshal call (e.g., do not use unmarshalFromString if the OTD contains DBCS items).
If the OTD’s content is marshaled, the portion corresponding to the DBCS node is yielded as-is, and is excluded from any character set transcoding that character data nodes of the OTD may be subjected to.
Copybook OTDs will not auto-truncate DBCS data. Since the OTD cannot know the specific DBCS encoding of the data, it cannot truncate the data at the correct character boundaries. If content that is set directly to a DBCS node exceeds the item’s width, the OTD will raise an exception.
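The DBCS rules above can be sketched as a small byte array holder. This is an illustration under stated assumptions, not the OTD's real node class: the name DbcsNodeSketch, its methods, the use of IllegalArgumentException, and the defensive copying are all choices made for this example. The source states only that content is kept as-is and that oversized content raises an exception.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a DBCS node; not the OTD's actual class. */
final class DbcsNodeSketch {

    private final int width;              // item width in bytes, from the Copybook
    private byte[] content = new byte[0];

    DbcsNodeSketch(int width) {
        this.width = width;
    }

    /** Stores the bytes unchanged; no inspection, transcoding, or truncation. */
    void set(byte[] value) {
        if (value.length > width) {
            // Assumed exception type; the source does not name one.
            throw new IllegalArgumentException(
                    "content of " + value.length + " bytes exceeds item width " + width);
        }
        content = value.clone(); // copy so later changes by the caller do not leak in
    }

    /** Returns exactly the bytes that were set. */
    byte[] get() {
        return content.clone();
    }

    public static void main(String[] args) {
        DbcsNodeSketch node = new DbcsNodeSketch(4);
        node.set(new byte[] {(byte) 0x82, (byte) 0xA0, (byte) 0x82, (byte) 0xA2});
        System.out.println(Arrays.toString(node.get()));
        try {
            node.set(new byte[5]); // one byte too wide: rejected, not truncated
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting oversized content rather than cutting it mirrors the reasoning in the text: without knowing the encoding, a cut at a fixed byte offset could split a multi-byte character.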