Content is the material you want to make available to be searched, and can take many forms.

Note: For a list of supported content formats, see Appendix A, Indexable File Types.

Once you have identified the content you want to index, you can add it to your Search project.

Each Search project automatically has one content set when it is created. You can use additional content sets to organize your content if you want to update portions of your index independently of each other, or to let client software that uses Search constrain user searches by content set.

ATG Search supports indexing content from either a file system or an ATG repository. Either type of content can in principle be considered to be either structured or unstructured. In practice, repository data is essentially always structured, and file system data is usually unstructured. File system data may be structured in the case of XHTML files (see the ATG Commerce Search Guide for information on the XHTML structure required by ATG Search).

When you index content as structured, the client application through which end-users access Search can constrain searches to specific text fields (“fielded” search). When using fielded search, a search is performed as usual, but only over the text contained in the specified fields; for example, a user might want to search only across product descriptions in a Commerce catalog, or only symptoms in a Service solution. Note that this differs from constraining by property, which is analogous to “search only items that have brand=BrandX”.
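The difference between fielded search and constraining by property can be illustrated with a small toy model. This is only a sketch: the Item class, the search function, and the sample catalog are hypothetical and are not part of any ATG Search API. It assumes each item carries named text fields plus separate metadata properties.

```python
# Toy in-memory model; none of these names come from ATG Search.
from dataclasses import dataclass, field


@dataclass
class Item:
    fields: dict                                    # text fields, e.g. name, description
    properties: dict = field(default_factory=dict)  # metadata, e.g. brand


def search(items, term, text_fields=None, property_constraints=None):
    """Return the items whose text contains term (case-insensitive).

    text_fields: if given, only these fields are searched (fielded search).
    property_constraints: if given, only items whose properties match
    every key/value pair are considered at all.
    """
    term = term.lower()
    results = []
    for item in items:
        # Property constraint: decides WHICH items are eligible.
        if property_constraints and any(
            item.properties.get(key) != value
            for key, value in property_constraints.items()
        ):
            continue
        # Fielded search: decides WHERE in an item the term is looked for.
        names = text_fields if text_fields is not None else list(item.fields)
        if any(term in item.fields.get(name, "").lower() for name in names):
            results.append(item)
    return results


catalog = [
    Item({"name": "Trail boot", "description": "Waterproof leather upper"},
         {"brand": "BrandX"}),
    Item({"name": "Leather belt", "description": "Classic brown belt"},
         {"brand": "BrandY"}),
]

everywhere = search(catalog, "leather")
fielded = search(catalog, "leather", text_fields=["description"])
by_brand = search(catalog, "leather", property_constraints={"brand": "BrandY"})
```

In this sketch, the unrestricted search matches both items, the fielded search matches only the boot (the term appears in its description), and the property-constrained search matches only the belt (the only BrandY item).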

Structured content exists in the form of ATG Commerce catalog data and ATG Knowledge solutions. These ATG products generate XHTML files, which ATG Search then indexes in a way that makes fielded search possible. You may want to index file system content as structured if you have many such XHTML files outside of a repository. Unstructured content, by contrast, usually consists of text documents in various forms.

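The distinction between structured and unstructured content also affects how the index is organized. All unstructured content is rooted at the /Documents docset, and all structured content is rooted at the /Solutions docset. Because the two kinds of content sit under separate roots, a constraint can be applied at this top level to limit a search to one kind or the other.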
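As a rough illustration of constraining at the docset root, the sketch below filters a list of index paths by prefix. Only the /Documents and /Solutions roots come from the text above. The in_docset helper and the paths beneath the roots are hypothetical, and this is not how ATG Search itself evaluates constraints.

```python
def in_docset(doc_path, docset):
    """Return True if doc_path is the docset itself or lies beneath it."""
    docset = docset.rstrip("/")
    return doc_path == docset or doc_path.startswith(docset + "/")


# Hypothetical index entries; only the two root names come from the guide.
index = [
    "/Documents/manuals/install.pdf",
    "/Documents/faq.html",
    "/Solutions/catalog/product-123",
]

unstructured = [path for path in index if in_docset(path, "/Documents")]
structured = [path for path in index if in_docset(path, "/Solutions")]
```

Matching on the root followed by a slash, rather than on the bare prefix, keeps a path such as /DocumentsArchive from being mistaken for part of /Documents.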

 