“Searching” means retrieving information using an input query. Commonly, the information is textual, it is retrieved from a collection of documents, and the queries are words or phrases entered by an end-user. The search results are typically those documents most relevant to the query, plus some indication of why the documents were retrieved. In order for searching to be efficient, the document collection is indexed by its terms in a secondary storage component, typically called the index.

ATG Search generalizes documents and other content into an abstraction called an index item. An index item can be an actual document such as an HTML file, a repository item such as an ATG Commerce product, or a piece of structured data from a database such as Microsoft Access. An index item consists of two elements: searchable text content and metadata. Metadata includes the title, summary, and other properties, and is used by Commerce Search features.

The index items are fed into ATG Search, which analyzes the content and stores a representation of each item in the index. The indexed items are organized into hierarchical sets, much like a directory scheme. ATG Search creates some sets from the physical organization of the items, some from the items' metadata, and others from topic categorization results. These item sets enable users to search within subsets of the collection and to browse the collection without entering a query.
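The idea of hierarchical item sets can be sketched in a few lines of Python. This is only an illustration and not how ATG Search stores its sets: the `ItemSetTree` class, its methods, the slash-separated set paths, and the sample item ids are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class ItemSetTree:
    """Toy hierarchy of item sets keyed by slash-separated paths (hypothetical)."""

    def __init__(self):
        # set path -> ids of the items in that set or in any set below it
        self._members = defaultdict(set)

    def add(self, item_id, set_path):
        # Register the item in the named set and in every ancestor set.
        parts = set_path.strip("/").split("/")
        for depth in range(1, len(parts) + 1):
            self._members["/".join(parts[:depth])].add(item_id)

    def items_in(self, set_path):
        # Restrict a search to one subset of the collection.
        return set(self._members.get(set_path.strip("/"), set()))

    def children(self, set_path):
        # Immediate child sets, for browsing without a query.
        prefix = set_path.strip("/") + "/"
        return sorted(
            path for path in self._members
            if path.startswith(prefix) and "/" not in path[len(prefix):]
        )


tree = ItemSetTree()
tree.add("doc1", "site/products/shoes")  # set from physical organization
tree.add("doc2", "site/products/hats")
tree.add("doc1", "brand/acme")           # set from metadata
tree.add("doc2", "topic/outdoor")        # set from topic categorization

print(tree.items_in("site/products"))
print(tree.children("site/products"))
```

An item can belong to several sets at once, which is why the sketch records membership per set path rather than storing a single parent on each item.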

Text content is processed through the natural language components, which identify the structural elements (such as sentences, headers, and tables) as well as the terms in the content. This processing is driven by a dictionary and other language data, which are also stored in the index. The terms are divided into statement vectors, each representing the sequence of terms in a structural portion of the content. The terms are also added to a term index, similar to the one found in the back of a large book. This back-of-the-book index allows for efficient searches and provides a global view of all the content.
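As a rough illustration of statement vectors and a back-of-the-book index, the sketch below splits each item into sentences, records the ordered terms of each sentence, and maps every term to the statements that contain it. It is a simplified stand-in, not the ATG Search implementation: the function names are hypothetical, the regular-expression sentence splitter and tokenizer replace the dictionary-driven language processing described above, and headers and tables are ignored.

```python
import re
from collections import defaultdict


def split_statements(text):
    # Naive sentence splitter; a stand-in for real structural analysis.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def terms_of(statement):
    # Lowercased alphanumeric tokens, in their original order.
    return re.findall(r"[a-z0-9]+", statement.lower())


def build_index(items):
    """items: dict of item id -> text. Returns (statements, postings)."""
    statements = {}              # (item_id, statement_no) -> ordered list of terms
    postings = defaultdict(set)  # term -> set of (item_id, statement_no)
    for item_id, text in items.items():
        for n, statement in enumerate(split_statements(text)):
            vector = terms_of(statement)
            statements[(item_id, n)] = vector
            for term in vector:
                postings[term].add((item_id, n))
    return statements, postings


def lookup(postings, query):
    # Statements that contain every term of the query.
    sets = [postings.get(term, set()) for term in terms_of(query)]
    return set.intersection(*sets) if sets else set()


items = {
    "a": "Red shoes ship free. Hats do not.",
    "b": "Free hats today!",
}
statements, postings = build_index(items)
print(lookup(postings, "free hats"))
```

Looking a term up in `postings` avoids scanning every item, which is the efficiency the back-of-the-book analogy refers to; the statement vectors keep term order available for anything that needs it.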

Rather than simple text queries, ATG Search accepts complex requests that specify what actions should be performed. A request may include parameters, processing options, constraints, security settings, and other information. The two primary requests are the Search Query and the View Item request. ATG Search returns responses that contain varied information depending on the type of request. For a Search Query, the response contains a list of results plus other information to drive the search application or user interface.
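The difference between a plain text query and a structured request can be sketched as follows. Everything here is an assumption made for illustration: the `SearchRequest` class, its field names, the `handle` function, and the response dictionaries are hypothetical and do not reflect the actual ATG Search request or response formats.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRequest:
    kind: str                                        # "query" or "view_item" (hypothetical)
    text: str = ""                                   # query text, for a search query
    item_id: str = ""                                # target item, for a view item request
    constraints: dict = field(default_factory=dict)  # e.g. {"set": "products"}
    options: dict = field(default_factory=dict)      # e.g. {"page_size": 10}


def handle(request, items):
    """items: dict of item id -> {"set": ..., "text": ...} (toy in-memory collection)."""
    if request.kind == "view_item":
        return {"type": "view_item", "item": items.get(request.item_id)}
    if request.kind == "query":
        wanted_set = request.constraints.get("set")
        words = request.text.lower().split()
        hits = [
            item_id
            for item_id, item in sorted(items.items())
            if (wanted_set is None or item["set"] == wanted_set)
            and all(w in item["text"].lower().split() for w in words)
        ]
        page_size = request.options.get("page_size", 10)
        # The response carries the results plus data a user interface could use.
        return {"type": "query", "total": len(hits), "results": hits[:page_size]}
    raise ValueError(f"unknown request kind: {request.kind}")


items = {
    "p1": {"set": "products", "text": "red running shoes"},
    "p2": {"set": "articles", "text": "how to clean shoes"},
}
print(handle(SearchRequest(kind="query", text="shoes",
                           constraints={"set": "products"}), items))
```

The same entry point serves both request types, and the shape of the response depends on the request kind, mirroring the behavior described above.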

For more information on how ATG Search processes content into a searchable index, see the ATG Search Query Reference Guide.

 