The bulk loader and the incremental loader generate the XHTML representations of the data. If loading is initiated from Search Administration, the loaders use a special document submitter (created automatically) to submit the documents to Search Routing, which writes batches of documents to disk as command files during the index estimation process. These command files are then executed to stream the XHTML data to the search engine for indexing.
You can initiate loading from the Dynamo Server Admin, to help optimize and debug your XHTML output. In this case, XHTML documents are not submitted to Routing, and no indexing job is initiated. Instead, the documents are directed either to the console or to files, depending on the value of the IndexingOutputConfig component’s documentSubmitter property:
To direct the documents to files, set documentSubmitter to a component of class atg.repository.search.indexing.submitter.FileDocumentSubmitter.
To direct the documents to the console, set documentSubmitter to a component of class atg.repository.search.indexing.submitter.ConsoleDocumentSubmitter, or just leave the property null, and a ConsoleDocumentSubmitter will be created automatically.
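The rule for where diagnostic output ends up can be summarized in a few lines of Java. This is only an illustrative sketch, not ATG code: OutputTarget, SubmitterTargetSketch, and targetFor are hypothetical names, and rejecting any other class name is an assumption, since the text above does not say what happens in that case.

```java
/** Where diagnostic XHTML output ends up. Hypothetical type, for illustration only. */
enum OutputTarget { FILES, CONSOLE }

/** Hypothetical helper that mirrors the documentSubmitter rule described above. */
final class SubmitterTargetSketch {
    static final String PKG = "atg.repository.search.indexing.submitter.";

    /**
     * @param submitterClass class name of the component that documentSubmitter
     *                       refers to, or null if the property is left unset
     */
    static OutputTarget targetFor(String submitterClass) {
        if (submitterClass == null) {
            // Property left null: a ConsoleDocumentSubmitter is created automatically.
            return OutputTarget.CONSOLE;
        }
        if (submitterClass.equals(PKG + "FileDocumentSubmitter")) {
            return OutputTarget.FILES;
        }
        if (submitterClass.equals(PKG + "ConsoleDocumentSubmitter")) {
            return OutputTarget.CONSOLE;
        }
        // Assumption: other submitter classes are outside the scope of this sketch.
        throw new IllegalArgumentException("Unrecognized submitter class: " + submitterClass);
    }
}
```

For example, targetFor(null) yields CONSOLE, which corresponds to leaving the property unset.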
To initiate loading from Dynamo Server Admin:
Access the page for the IndexingOutputConfig component in the Component Browser of the Dynamo Server Admin.
In the Methods section, click the link for bulkLoadForDiagnostics.
At the confirmation prompt, click Invoke Method.
Note that for a multisite application, the generated documents will reflect all enabled sites.
Look over the XHTML output carefully to identify potential improvements. For example, if you see redundant values for certain properties, you can specify the unique filter in the XML definition file for those properties (see Using Property Value Filters). Or if you notice that certain text properties have values that users are unlikely to search for, you can remove those properties from the definition file.
Configuring a FileDocumentSubmitter Component
If documentSubmitter is set to a component of class FileDocumentSubmitter, a separate file is created for each XHTML document generated. The location and names of the files are automatically determined based on the following properties:
baseDirectory
The pathname of the directory to write the files to.
filePrefix
The string to prepend to the name of each generated file. Default is the empty string.
fileSuffix
The string to append to the name of each generated file. Default is “.xhtml”.
nameByRepositoryId
If true, each filename will be based on the repository ID of the item the file represents. If false (the default), files are named 0.xhtml, 1.xhtml, etc.
overwriteExistingFiles
If true, a generated filename that matches an existing file causes the existing file to be overwritten by the new file. If false (the default), the new file is given a different name to avoid overwriting the existing file.
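To make the interaction of these properties concrete, here is a minimal Java sketch of how a filename might be derived from them. It is not the FileDocumentSubmitter implementation: FileNamingSketch and nextFileName are hypothetical names, an in-memory set stands in for the contents of baseDirectory, and the "-1", "-2" collision scheme is an assumption (the description above says only that a different name is chosen).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of the naming rules; not ATG's FileDocumentSubmitter. */
final class FileNamingSketch {
    private final String filePrefix;               // default in ATG: empty string
    private final String fileSuffix;               // default in ATG: ".xhtml"
    private final boolean nameByRepositoryId;      // default in ATG: false
    private final boolean overwriteExistingFiles;  // default in ATG: false

    // Stands in for the files already present in baseDirectory.
    private final Set<String> existing = new HashSet<>();
    private int counter = 0;

    FileNamingSketch(String filePrefix, String fileSuffix,
                     boolean nameByRepositoryId, boolean overwriteExistingFiles) {
        this.filePrefix = filePrefix;
        this.fileSuffix = fileSuffix;
        this.nameByRepositoryId = nameByRepositoryId;
        this.overwriteExistingFiles = overwriteExistingFiles;
    }

    /** Returns the filename (relative to baseDirectory) for the next document. */
    String nextFileName(String repositoryId) {
        // Either the repository ID or a running counter (0, 1, 2, ...) forms the stem.
        String stem = nameByRepositoryId ? repositoryId : Integer.toString(counter++);
        String name = filePrefix + stem + fileSuffix;
        if (!overwriteExistingFiles) {
            // Assumed scheme: append -1, -2, ... until the name is unused.
            int n = 1;
            while (existing.contains(name)) {
                name = filePrefix + stem + "-" + (n++) + fileSuffix;
            }
        }
        existing.add(name);
        return name;
    }
}
```

With the default settings, successive calls produce 0.xhtml, 1.xhtml, and so on. With nameByRepositoryId set to true and a filePrefix of doc_, an item whose repository ID is prod1 would be written as doc_prod1.xhtml under this sketch.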
Viewing the Output in the Component Browser
You can view the XHTML output for a single document-level repository item (and its child items) in the Component Browser of Dynamo Server Admin. To do this, access the page for the IndexingOutputConfig component in the Component Browser. The page will include a Test Document Generation section.
Fill in the repository ID, and click Generate. The page will display the XHTML output. The output is not directed to Routing, and no indexing job is initiated.
Note that you can also view XHTML output in ATG Search Administration. See the Content Inspection section.