AquaLogic Interaction Administrator Guide

About Search Service Internationalization Support

The portal supports 61 languages. It stores and retrieves text as Unicode characters, and the search service has access to linguistic rules for multiple languages during full-text indexing. This makes it possible to keep documents in different languages in the same search collection and significantly improves search results. The user interface handles text using UTF-8 encoding, so search results display correctly as long as the appropriate fonts are available to the web browser.
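As a general illustration of why Unicode with UTF-8 encoding suits mixed-language collections, the following minimal Python sketch encodes the word for "search" in several scripts and decodes it again. It does not use any portal API; the sample words and the names `samples` and `round_trips` are assumptions chosen for illustration only.

```python
# Illustration only: one encoding (UTF-8) can represent text from many scripts.
# The sample words below are hypothetical test data, not portal content.
samples = {
    "English": "search",
    "German": "Suche",
    "Russian": "поиск",
    "Japanese": "検索",
}

round_trips = {}
for language, word in samples.items():
    encoded = word.encode("utf-8")           # bytes as they would travel to a browser
    decoded = encoded.decode("utf-8")        # text recovered from those bytes
    round_trips[language] = (decoded == word)
    # Character count and byte count differ for non-ASCII scripts.
    print(f"{language}: {len(word)} characters, {len(encoded)} bytes")
```

Every word survives the round trip unchanged, although the number of bytes per character varies by script. Whether the characters are drawn correctly on screen still depends on the fonts available to the browser.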

For some supported languages, the portal also provides word stemming and compound decomposition. The search service uses this additional information to enhance the results of the full-text index. For a list of supported languages, including those with enhanced support, see Search Service Language Support.

Crawling International Document Repositories

Web and file content crawlers are each associated with a specific language. All documents processed by a content crawler are indexed using the linguistic rules for that language. For optimal results, create a separate content crawler for each language in the repository. For most European languages, mixing languages within a single crawl does not make the content unsearchable; however, word stemming and decomposition information will be missing for documents in any language other than the content crawler's designated language. Avoid indexing Asian-language documents with a content crawler configured for a European language, because Asian languages require special tokenization rules.
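The need for special tokenization rules can be seen with a small Python sketch. It is not the search service's tokenizer; `whitespace_tokenize` is a hypothetical stand-in for a naive rule that works for space-delimited European text, and the sample sentences are assumptions for illustration.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Split text on whitespace, as a naive European-style rule might.

    Hypothetical helper for illustration; the portal's real rules are
    language-specific and considerably more sophisticated.
    """
    return text.split()


english = "The portal indexes documents"
japanese = "ポータルは文書を索引付けします"  # written without spaces between words

english_tokens = whitespace_tokenize(english)
japanese_tokens = whitespace_tokenize(japanese)

print(english_tokens)        # each English word becomes its own token
print(len(japanese_tokens))  # the whole Japanese sentence stays one token
```

Because the Japanese sentence contains no spaces, the naive rule treats it as a single token, so a query for an individual word inside it would not match. This is the kind of failure that a crawler configured for the correct language avoids.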

Submitting International Documents to the Knowledge Directory

When you use the Submit Document utility to add documents to the Knowledge Directory, you specify the document language by choosing from a pop-up list of supported languages. As with content crawlers, set this to the actual language of the document for optimal results. Specifying the correct language is particularly important for proper indexing of Asian-language content.

