ATG Search runs within a standard ATG platform installation. Due to the heavy performance demands of indexing content and serving search responses, you must dedicate at least one machine in your network exclusively to Search.
ATG Search has the following components:
Search Administration: The interface where you create projects, index content, deploy indexes, and so on. Runs within a standard ATG installation. You should have only one Search Administration instance in your installation.
The SearchAdmin.AdminUI module is required to run Search Administration.
Search engine: Serves answers to end-user queries. Does not require a full ATG installation. Your installation can include any number of search engines, but you should run only one search engine per processor core. A single search engine can use up to 1.4 GB of memory, assuming a 32-bit operating system. Your hardware should be sized accordingly.
The DAF.Search.Routing module starts the Search engine(s) either locally or remotely, and coordinates communication between the client application, Search engine(s), and the Search database. Runs within a standard ATG installation. You can run multiple instances of this module, but one should always run along with Search Administration.
The DAF.Search.Base module is a lightweight routing module that can be run independently of Search Administration. DAF.Search.Base is automatically included in Routing and contains base routing functionality. It is also used stand-alone in client applications. You can run multiple instances of this module.
Client application: The application through which your end users place their search queries. This application must include the DAF.Search.Base module, and must also include DAF.Search.Query if your application uses legacy form handlers.
Search index: The searchable content deployed on your site. An index is composed of one or more logical partitions, each of which is associated with a content set configured in Search Administration. Each logical partition is composed of one or more physical partitions, each of which is served up by a search engine.
Search database: Consists of two repositories. One stores information about Search engines, index structure, and deployment; the other stores information about users and Search Administration. Search Administration requires access to both repositories. Both Search Administration and your client application must have access to the routing repository.
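The relationship between an index, its logical and physical partitions, and the search engines that serve them can be sketched as a small data model. This is only an illustration for sizing purposes, not part of any ATG API: the class names, field names, and the sample content sets are hypothetical. It assumes one engine per physical partition and one engine per processor core, and uses the 1.4 GB per-engine figure quoted above as a worst case on a 32-bit operating system.

```python
from dataclasses import dataclass, field
from typing import List

# Figure quoted in the text: upper bound per engine on a 32-bit OS.
MAX_ENGINE_MEMORY_GB = 1.4


@dataclass
class LogicalPartition:
    """Hypothetical model of a logical partition tied to one content set."""
    content_set: str
    physical_partitions: int = 1  # each one is served by a search engine


@dataclass
class SearchIndex:
    """Hypothetical model of an index made of logical partitions."""
    name: str
    logical_partitions: List[LogicalPartition] = field(default_factory=list)

    def engines_required(self) -> int:
        # Assumption: exactly one search engine serves each physical partition.
        return sum(p.physical_partitions for p in self.logical_partitions)

    def cores_required(self) -> int:
        # The text recommends one search engine per processor core.
        return self.engines_required()

    def max_memory_gb(self) -> float:
        # Worst case: every engine reaches the 32-bit ceiling.
        return self.engines_required() * MAX_ENGINE_MEMORY_GB


# Sample values are invented for illustration.
index = SearchIndex(
    name="catalog",
    logical_partitions=[
        LogicalPartition(content_set="products", physical_partitions=2),
        LogicalPartition(content_set="articles", physical_partitions=1),
    ],
)
print(index.engines_required(), index.cores_required(), index.max_memory_gb())
```

The numbers this produces are a lower bound for serving the index once; any redundancy you add increases the engine and core counts accordingly.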
Search Administration, routing, and Search engines must all have access to the deployment share directory, a scalable, shared directory where master copies of indexes are stored. Ideally, locate this directory on a high-performance machine separate from the Search Administration machine. Without access to it, routing and Search engine components cannot deploy or search indexes.
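Before deploying indexes, it can help to confirm from each machine that the deployment share is actually reachable. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and checking for both read and write permission is an assumption, since the text above only says that the components need "access". The demonstration runs against a temporary directory standing in for the real share.

```python
import os
import tempfile


def check_deployment_share(path: str) -> dict:
    """Report whether the current process can see, read, and write `path`."""
    return {
        "exists": os.path.isdir(path),
        # Listing a directory needs both read and execute permission.
        "readable": os.access(path, os.R_OK | os.X_OK),
        "writable": os.access(path, os.W_OK),
    }


# Demonstration against a temporary directory standing in for the share.
with tempfile.TemporaryDirectory() as share:
    status = check_deployment_share(share)
    print(status)
```

Run a check like this under the same operating system account that runs Search Administration, routing, or the search engine, since permissions are evaluated for the calling process.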
Note: The size of a deployed index bears no direct relationship to the size of the raw information being indexed. Dictionaries, topics, and other customization data can all add to the size of the index, as can the nature of the content itself. For example, content consisting mostly of pictures with some metadata might form a very small index relative to the raw content size, while a product catalog with many small, unique pieces of information might form a relatively large one.
There are several possible configurations for the components described. The simplest option is to run all components locally, on one machine (the possible exception being the database). This configuration is sufficient for testing purposes and for estimating the size of your optimal configuration, but is not likely to be used in a production environment.
You may want to consider a self-contained installation to begin with. Use this installation to estimate the size of your index or indexes, then add routing and search engine installations as necessary. The original installation can then remain as the Search Administration/indexing machine. In a production environment, you should separate the searching and indexing functions. The diagram that follows shows a minimal configuration:
Ideally, resources will be available to provide redundancy, which prevents performance bottlenecks and guards against failures should one component go down. Additionally, if your content set is large enough to require multiple search engines, you will need multiple CPUs in order to serve the index. The next diagram shows a single CPU dedicated to indexing, and a minimum of three for searching.
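The CPU arithmetic behind such a configuration can be sketched as follows. This is a rough planning aid under stated assumptions, not an ATG tool: the function and parameter names are hypothetical, redundancy is modeled simply as extra copies of each physical partition, and each copy is assumed to need its own search engine and therefore its own core.

```python
def plan_cpus(physical_partitions: int,
              copies_per_partition: int = 1,
              indexing_cpus: int = 1) -> dict:
    """Estimate CPU counts with indexing kept separate from searching."""
    if physical_partitions < 1 or copies_per_partition < 1:
        raise ValueError("partition and copy counts must be at least 1")
    if indexing_cpus < 0:
        raise ValueError("indexing_cpus cannot be negative")
    # Assumption: one engine per partition copy, one engine per core.
    search_cpus = physical_partitions * copies_per_partition
    return {
        "indexing": indexing_cpus,
        "searching": search_cpus,
        "total": indexing_cpus + search_cpus,
    }


# Three physical partitions is an assumed example value.
print(plan_cpus(physical_partitions=3))
# The same index with two copies of each partition for redundancy.
print(plan_cpus(physical_partitions=3, copies_per_partition=2))
```

With three partitions and no redundancy, the sketch yields one indexing CPU and three searching CPUs; doubling the copies doubles only the searching side.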