XLIFF, or XML Localization Interchange File Format, is an XML-based format that was developed specifically for use in translation projects. As its name and origin suggest, XLIFF was designed to facilitate the exchange of information between different systems during document localization. In general, source text in an arbitrary file format is converted into XLIFF by a filter that depends on the source structure. This filter extracts the translatable content as plain, unformatted text, allowing the translator to concentrate on the meaning of the text, regardless of the format, structure, or appearance of the original source file. Information about the source structure and formatting is nevertheless preserved in the XLIFF file in the form of tags. These tags are used to process the translated text so that the original structure and formatting are reproduced when the translated text is converted back to the original format.


It is assumed that, if you choose to localize your Developer content using XLIFF, you are already familiar with basic XLIFF concepts and are working with an XLIFF tool or sending your content to a localization vendor who does. Therefore, the discussion here provides only a description of the content of Developer-generated XLIFF files and the subset of XLIFF applicable under these conditions. For further information on XLIFF in general, see the web site of the OASIS Technical Committee on XLIFF at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xliff.


Developer XLIFF Processing

When you execute the Export Localization command, the Developer applies a filter to the selected content, extracting all translatable text and rendering it in a form that complies with XLIFF specifications. The text extracted depends on the type of document selected, as discussed later in this section. Within the XLIFF file, the text is organized by document and then by translation unit within each document. Therefore, as you process the XLIFF file, you proceed sequentially through each of the exported documents.


Warning! During the Localization export process, the Developer applies a unique ID to each translation unit that includes information regarding the document ID, document type, and text type (if the document can have multiple translation units; see below). Therefore, you should neither merge nor split translation units during processing in XLIFF, as this could lead to problems upon import. Specifically, if the file selected for import contains translation units with invalid IDs or other problems, the Developer will skip such translation units, thus not importing the corresponding text.
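
Because units with invalid IDs are skipped on import, it can be worth comparing the translation unit IDs in the translated file against those in the exported file before importing. The following Python sketch shows one way to do this. It assumes that the export uses the XLIFF 1.2 namespace, and the sample IDs are hypothetical placeholders rather than the Developer's real ID scheme.

```python
import xml.etree.ElementTree as ET

# Assumption: the export uses the XLIFF 1.2 namespace; adjust if your files differ.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def trans_unit_ids(xliff_text):
    """Return the trans-unit IDs found in an XLIFF string, in document order."""
    root = ET.fromstring(xliff_text)
    return [tu.get("id") for tu in root.iterfind(".//x:trans-unit", XLIFF_NS)]


def compare_ids(exported_xliff, translated_xliff):
    """Report IDs that disappeared or appeared between export and translation."""
    before = set(trans_unit_ids(exported_xliff))
    after = set(trans_unit_ids(translated_xliff))
    return {"missing": sorted(before - after), "unexpected": sorted(after - before)}


# Hypothetical sample data; these IDs are placeholders, not real Developer IDs.
EXPORTED = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="doc-1" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="doc-1.name"><source>Glossary</source></trans-unit>
      <trans-unit id="doc-1.term.1"><source>Frame</source></trans-unit>
    </body>
  </file>
</xliff>"""

# Simulates a tool that merged the two units into one unit with a new ID.
MERGED = EXPORTED.replace(
    '<trans-unit id="doc-1.term.1"><source>Frame</source></trans-unit>', ""
).replace('id="doc-1.name"', 'id="doc-1.merged"')

print(compare_ids(EXPORTED, EXPORTED))
print(compare_ids(EXPORTED, MERGED))
```

An empty "missing" and "unexpected" list suggests that no units were merged, split, or renamed. It does not prove that the Developer will accept every unit.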


For module, section, assessment, user-defined question, and package documents, the only text exported for translation is the document name. Each of these document types corresponds to a single translation unit containing just the unformatted document name.

 

For glossary documents, in addition to the document name, each term and each definition link tooltip is also exported as a separate translation unit. Therefore, the number of translation units created for a glossary document equals the number of terms plus the number of tooltips plus 1. These translation units also consist of unformatted text.
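
As a quick check of the count described above, the arithmetic can be written as a one-line Python function. The function name and the sample numbers are illustrative only.

```python
def glossary_unit_count(term_count, tooltip_count):
    """Terms + definition link tooltips + 1 unit for the document name."""
    return term_count + tooltip_count + 1


# A hypothetical glossary with 12 terms, 5 of which have a link tooltip.
print(glossary_unit_count(12, 5))  # 18
```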

 

For question documents, the text that is exported depends on the question type. All question types have a document name. All question types, with the exception of User-Defined, contain Remediation Text and Question Text, each of which is exported as a translation unit. Multiple Choice (Many Answers and Single Answer) and Matching questions have Option text for each answer option. Each Option text is exported as a translation unit. Fill In questions have Blank Answer text that is exported as a translation unit. All question translation units consist of unformatted text.


Each web page document is exported as at least two translation units, one for the document name and one for the body of the web page, regardless of the amount of text it contains. As with the other document types, the translation unit for the document name contains only unformatted text. However, unlike the preceding document types, web pages can include such elements as formatted text, images, and hyperlinks. Although such information is not relevant to the translation of the text itself, it is important for ensuring that the translated document has the same structure and appearance as the source document. Therefore, the XLIFF generated by the Developer includes such information within web page translation units using the standard XLIFF processing tags <bpt> (begin paired tag) and <ept> (end paired tag). These tags surround appropriate pieces of text in the source translation unit and contain XHTML code (generated by the Web Page Editor) corresponding to the formatting and/or processing needed.
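
To illustrate how <bpt> and <ept> tags sit alongside translatable text, the following Python sketch separates the two for a single translation unit. The unit shown is a hypothetical example written for this illustration (the ID and tag ids are placeholders), and the namespace is omitted for brevity. Actual Developer output may differ in detail.

```python
import xml.etree.ElementTree as ET

# Hypothetical web page body unit; the ID and inline-tag ids are placeholders.
UNIT = """<trans-unit id="page-1.body">
  <source><bpt id="1">&lt;p&gt;</bpt>Click <bpt id="2">&lt;strong&gt;</bpt>Save<ept id="2">&lt;/strong&gt;</ept> to continue.<ept id="1">&lt;/p&gt;</ept></source>
</trans-unit>"""

INLINE_TAGS = {"bpt", "ept"}


def split_source(source):
    """Return (translatable_text, markup) for a <source> element.

    markup is a list of (tag, id, code) tuples in document order.
    """
    text_parts = [source.text or ""]
    markup = []
    for child in source:
        if child.tag in INLINE_TAGS:
            # The tag body is formatting code, not translatable text.
            markup.append((child.tag, child.get("id"), child.text or ""))
        else:
            text_parts.append("".join(child.itertext()))
        # Text following an inline tag belongs to the translatable content.
        text_parts.append(child.tail or "")
    return "".join(text_parts), markup


source = ET.fromstring(UNIT).find("source")
text, markup = split_source(source)
print(text)
for tag, tag_id, code in markup:
    print(tag, tag_id, code)
```

Here the translatable text is "Click Save to continue.", while the paragraph and bold markup travel separately inside the paired tags.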


For Developer web pages, the XLIFF processing tags can contain three basic types of XHTML code: text formatting, paragraph formatting, and image and hyperlink properties. Tags for text formatting that are supported by the Web Page Editor include <font>, <strong>, <em>, and <u>, for font properties (such as font family/style, size, and color), bold, italic, and underline, respectively. Paragraph formatting tags supported by the Web Page Editor include <p> (with the optional align attribute), <br>, <ul>, <ol>, and <li> for paragraph return (including paragraph alignment, if necessary), line break, unordered (bulleted) list, ordered (numbered) list, and list item, respectively. Both text and paragraph formatting tags should be copied, unchanged, from the source text to the corresponding portion of the translated text. This ensures that the appearance of the translated text matches that of the equivalent source text.
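
One way to confirm that formatting tags were copied unchanged is to compare the inline tags of the source and target in each unit. The Python sketch below does this for hypothetical sample units. It compares the tags as an unordered collection, on the assumption that a translation may legitimately reorder them along with the words they surround.

```python
import xml.etree.ElementTree as ET

INLINE_TAGS = ("bpt", "ept")


def inline_codes(element):
    """Return the sorted (tag, id, code) tuples of the paired tags in an element."""
    return sorted(
        (child.tag, child.get("id", ""), child.text or "")
        for child in element
        if child.tag in INLINE_TAGS
    )


def tags_preserved(unit):
    """True if the target carries exactly the same paired tags as the source."""
    source = unit.find("source")
    target = unit.find("target")
    if source is None or target is None:
        return False
    return inline_codes(source) == inline_codes(target)


# Hypothetical units; the IDs and text are placeholders.
GOOD = ET.fromstring("""<trans-unit id="page-1.body">
  <source>Click <bpt id="1">&lt;strong&gt;</bpt>Save<ept id="1">&lt;/strong&gt;</ept> now.</source>
  <target>Klicken Sie jetzt auf <bpt id="1">&lt;strong&gt;</bpt>Speichern<ept id="1">&lt;/strong&gt;</ept>.</target>
</trans-unit>""")

# The translator dropped the bold formatting in this one.
BAD = ET.fromstring("""<trans-unit id="page-2.body">
  <source>Click <bpt id="1">&lt;strong&gt;</bpt>Save<ept id="1">&lt;/strong&gt;</ept> now.</source>
  <target>Klicken Sie jetzt auf Speichern.</target>
</trans-unit>""")

print(tags_preserved(GOOD))
print(tags_preserved(BAD))
```

A check like this applies to text and paragraph formatting tags. Units in which an image or hyperlink attribute was deliberately translated would need to be exempted.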


For images (<img> tag) in web pages, the processing tags always include the image source attribute (src) and might also include attributes representing the properties that can be set using the Image Properties dialog box, namely, alternative text (alt), size (style), border (border), and alignment (align). Likewise, for hyperlinks (<a>, or anchor, tag) in web pages, the processing tags always include the hyperlink target attribute (href) and might also include an attribute representing the tooltip (title). In general, image and hyperlink processing tags should also be copied from the source text to the corresponding portion of the translated text. However, you might want to translate the alternative text for an image or the tooltip for a hyperlink. You can do so by directly editing the corresponding <bpt> processing tag to include the appropriate translation in the alt or title attribute, respectively.
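
The following Python sketch illustrates translating a hyperlink tooltip by editing the title attribute inside a <bpt> tag, as described above. The unit, URL, and helper name set_attribute are hypothetical, and the helper assumes that attribute values are enclosed in double quotes.

```python
import html
import re
import xml.etree.ElementTree as ET


def set_attribute(code, name, new_value):
    """Replace the value of one double-quoted attribute in a tag string."""
    pattern = re.compile(r'(\b' + re.escape(name) + r'=")[^"]*(")')
    escaped = html.escape(new_value, quote=True)
    # A function replacement avoids backslash handling in the new value.
    return pattern.sub(lambda m: m.group(1) + escaped + m.group(2), code, count=1)


# Hypothetical unit; the target starts as a copy of the source tags.
UNIT = """<trans-unit id="page-1.body">
  <source>See the <bpt id="1">&lt;a href="http://www.example.com" title="Product home page"&gt;</bpt>product site<ept id="1">&lt;/a&gt;</ept>.</source>
  <target>Siehe die <bpt id="1">&lt;a href="http://www.example.com" title="Product home page"&gt;</bpt>Produktseite<ept id="1">&lt;/a&gt;</ept>.</target>
</trans-unit>"""

unit = ET.fromstring(UNIT)
bpt = unit.find("target/bpt")
# Only the tooltip changes; the href is left exactly as exported.
bpt.text = set_attribute(bpt.text, "title", "Startseite des Produkts")
print(bpt.text)
```

The same helper could be applied to the alt attribute of an image tag. In both cases the src and href values are left untouched.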


Note that, except for URL images or URL hyperlinks, you should not need to edit the image source (img src) or hyperlink target (a href) attributes in web page translation units. Rather, these values are updated when you create a duplicate of your content (including related documents) before exporting it for localization. If you need to change these attributes further, you should edit the image properties or edit the hyperlink properties from the Web Page Editor after importing the translated web page.


Note: Hyperlinks are included in translation units only for manually created links, not for glossary links. Rather, you should update glossary links from the Developer after importing the translated content and glossary (or glossaries).


Topics can consist of a large number of translation units, and the translation units corresponding to bubble text can include both formatting and markup for play modes and outputs. Specifically, each of the following components of a topic, if present, corresponds to an individual translation unit:

All topic translation units other than custom bubble text contain unformatted text.


Custom bubble text can contain both text and paragraph formatting, as well as bubble text links. As with web pages, this information is included within bubble text translation units using the standard XLIFF processing tags <bpt> (begin paired tag) and <ept> (end paired tag). These tags surround appropriate pieces of text in the source translation unit and contain XML code (generated by the Topic Editor) corresponding to the formatting needed.


For topics, the XLIFF processing tags can contain three basic types of XML code: text formatting, paragraph formatting, and hyperlink properties. All text formatting supported by the Topic Editor is specified using the <fmt> (format) tag, which takes the attributes font, sty, clr, and unbr, for font family and size, style (with possible values b for bold, i for italic, and u for underline), color (specified using hexadecimal color codes), and nonbreaking text, respectively. The only paragraph formatting tag supported by the Topic Editor is <p> with the optional align attribute, for paragraph return including paragraph alignment, if necessary. As with web pages, both text and paragraph formatting tags in custom bubble text should be copied, unchanged, from the source to the corresponding portion of the translated text. This ensures that the appearance of the translated text matches that of the equivalent source text.


For hyperlinks (<a>, or anchor, tag) in bubble text, the processing tags include only the hyperlink target attribute (href). In general, these hyperlink processing tags should also be copied, unchanged, from the source text to the corresponding portion of the translated text.


Note that, except for URL hyperlinks, you should not need to edit the hyperlink target (a href) attributes in bubble text translation units. Rather, these values are updated when you create a duplicate of your content (including related documents) before exporting it for localization. If you need to change these attributes further, you should update the bubble text link from the Topic Editor after importing the translated topic.


Note: Hyperlinks are included in translation units only for manually created links, not for glossary links. Rather, you should update glossary links from the Developer after importing the translated content and glossary (or glossaries).


In addition to formatting and hyperlinks, custom bubble text can also include markup for play mode (visible in See It/Try It, Do It, and/or Know It modes) and output (visible in Player and/or print outputs). If all of the custom bubble text in a topic frame is marked for the same play modes and outputs, it is exported as a single translation unit. However, if the same bubble contains custom text marked for different modes and/or outputs, multiple translation units are generated, one for each unique set of play modes and outputs.


For example, consider a bubble containing the text "This text is valid in all modes. All of the text is valid in both outputs. This text is valid only in Know It? mode." In the Topic Editor, the first two sentences are marked to appear in all play modes, whereas the third sentence is marked to appear only in Know It mode; all of the text is marked for both Player and print outputs. When this topic is exported for localization, two translation units are created for this bubble. The first contains the text "This text is valid in all modes. All of the text is valid in both outputs." The second contains the text "This text is valid in all modes. All of the text is valid in both outputs. This text is valid only in Know It? mode." The IDs of the two translation units reflect the difference in markup in the source: the first is specified for See It/Try It and Do It modes and both outputs, and the second is specified for Know It mode only and both outputs. If the output markup of any of the bubble text had also differed (for example, if the second sentence had been marked for print only), these translation units would have been divided further to reflect these differences.


Despite the obvious overlap in content, these translation units should be translated and maintained as two separate units because they represent different combinations of mode and/or output markup. Upon import, these translation units will be written to the same frame bubble, with the first marked to appear only in See It/Try It and Do It modes and the second marked to appear only in Know It mode. That is, where the original bubble text consisted of three sentences, the translated bubble text will consist of five sentences overall: two sentences from the first translation unit and three sentences from the second. Although this approach leads to some duplicate translation, and to repeated text within the bubble under different markup combinations, it ensures that the play mode and output markup of the source text is accurately reproduced in the translated bubble text, so that the translated topic publishes correctly. This approach also simplifies the translation process: translators are frequently comfortable with formatting attributes such as bold or font name, but might find it more difficult to interpret and appropriately process the mode and output markup, which are specific to the Developer. Moreover, use of a translation memory can minimize the impact of any duplication.
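
The splitting behavior described above can be modeled in a few lines of Python. This sketch is an inferred reading of that description, not the Developer's actual algorithm: it assumes that each combination of play mode and output sees some subset of the bubble text, and that combinations seeing identical text share one translation unit. The names MODES, OUTPUTS, and translation_units are hypothetical.

```python
from itertools import product

MODES = ("See It/Try It", "Do It", "Know It")
OUTPUTS = ("Player", "Print")


def translation_units(segments):
    """Group mode/output combinations by the bubble text visible in each.

    segments is a list of (text, modes, outputs) tuples in bubble order.
    Returns a list of (unit_text, combinations) pairs.
    """
    groups = {}
    for mode, output in product(MODES, OUTPUTS):
        visible = tuple(
            text for text, modes, outputs in segments
            if mode in modes and output in outputs
        )
        if visible:
            groups.setdefault(visible, []).append((mode, output))
    return [(" ".join(visible), combos) for visible, combos in groups.items()]


S1 = "This text is valid in all modes."
S2 = "All of the text is valid in both outputs."
S3 = "This text is valid only in Know It? mode."

# The example from the text: the third sentence appears only in Know It mode.
units = translation_units([
    (S1, MODES, OUTPUTS),
    (S2, MODES, OUTPUTS),
    (S3, ("Know It",), OUTPUTS),
])
for unit_text, combos in units:
    print(len(combos), "combination(s):", unit_text)

# The variation from the text: the second sentence is marked for print only.
further = translation_units([
    (S1, MODES, OUTPUTS),
    (S2, MODES, ("Print",)),
    (S3, ("Know It",), OUTPUTS),
])
print(len(further), "units after the further split")
```

Under these assumptions, the example bubble yields two units (two sentences and three sentences), and marking the second sentence for print only yields four.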


Summary

In summary, the translation of custom Developer text using XLIFF is relatively straightforward, with most translation units consisting simply of unformatted text. The exceptions are web page body text and custom bubble text, which can include formatting and processing information in the form of <bpt>/<ept> processing tags. For the most part, these tags should be copied directly from the source text to the linguistically equivalent portion of the translated text.


In addition, in cases where the mode and/or output markup of the text within a bubble differs, multiple translation units are created for a single bubble. Because the IDs of the resulting translation units contain information on markup, these translation units should always be processed individually, even if they contain similar text. This preserves the appropriate bubble text markup upon import of the translated content. A translation memory should prove helpful in automating translation in such cases.

