Endeca Control System directory structure

Before you start building your instance configuration, you must create a directory structure to support your data processing back end. The layout of that structure depends on the mechanism you have chosen to control your Endeca environment: the Endeca Control System or the Endeca Application Controller.

If you are using the Endeca Control System to control your environment, you will have to create a directory structure to contain source data, control scripts, system-generated files, log files, and so forth. The example below shows the directory structure used for the sample_wine_data reference implementation:

instance_root
	data
		forge_input
		incoming
		partition0
			dgidx_output
			dgraph_input
			forge_output
			state
	etc
	logs
	reports

The table below describes the contents of each directory:

Directory	Description
`instance_root`	Contains all required subdirectories for this instance of your Endeca implementation.
data	Contains subdirectories for your instance configuration, source data extracts, and system-generated files.
forge_input	Contains the baseline pipeline file (typically named `pipeline.epx`), the partial updates pipeline file (if you are running partial updates; the file is typically named `partial_pipeline.epx`), and the index configuration files (*.xml).
incoming	Contains data ready for processing by Forge. On a production site, the files in this directory may have been created by a data extraction process on the customer’s database or may have been picked up from an FTP server.
partition0	Contains subdirectories for system-generated files, such as Forge output, Dgidx output, and Dgraph input.
state	Contains any state information that must be saved between runs of the Data Foundry, such as auto-generated dimension IDs.
forge_output	Contains data that has been processed by Forge and is ready for indexing.
dgidx_output	Contains indices that have been processed by Dgidx and output in MDEX Engine format.
dgraph_input	Contains a copy of the MDEX Engine indices stored in dgidx_output. When you start the MDEX Engine (Dgraph) process, you should point it at this copy of the indices. Keeping a separate copy isolates the indices your running MDEX Engine is using from those that are being updated.
etc	Contains system-level configuration for your Endeca implementation, such as control scripts.
logs	Contains log files generated by the various Endeca components.
reports	Contains any reports you choose to generate for your implementation.
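
The relationship between dgidx_output and dgraph_input described in the table can be illustrated with a short Python sketch. It is not part of the Endeca tooling: the function name refresh_dgraph_input is hypothetical, and the sketch assumes that a plain recursive file copy is adequate. In a real deployment, your control scripts determine when and how this copy happens, and the sketch does not start or stop the Dgraph process.

```python
import shutil
from pathlib import Path


def refresh_dgraph_input(partition_dir):
    """Replace dgraph_input with a fresh copy of dgidx_output.

    Hypothetical helper for illustration only. partition_dir is a
    partition directory such as instance_root/data/partition0.
    Returns the sorted names of the entries now in dgraph_input.
    """
    partition = Path(partition_dir)
    source = partition / "dgidx_output"
    target = partition / "dgraph_input"

    # Remove the previous copy so stale index files do not linger.
    if target.exists():
        shutil.rmtree(target)

    # Copy the whole index directory; dgidx_output itself is untouched.
    shutil.copytree(source, target)
    return sorted(entry.name for entry in target.iterdir())
```

Because the copy is one-directional, later Dgidx runs can rewrite dgidx_output without touching the files in dgraph_input.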

While you can structure your directories in any way you want, Oracle recommends you mimic the directory structure of the sample_wine_data reference implementation in order to maximize reuse of code, configuration settings, and control scripts.
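
As a minimal sketch, the recommended layout could be created with a few lines of Python. The subdirectory names come from the sample_wine_data structure shown above; the function name create_instance_dirs and the use of Python are assumptions made for illustration and are not part of the Endeca distribution.

```python
from pathlib import Path

# Relative paths taken from the sample_wine_data directory structure.
SUBDIRS = [
    "data/forge_input",
    "data/incoming",
    "data/partition0/dgidx_output",
    "data/partition0/dgraph_input",
    "data/partition0/forge_output",
    "data/partition0/state",
    "etc",
    "logs",
    "reports",
]


def create_instance_dirs(instance_root):
    """Create the directory skeleton under instance_root.

    Hypothetical helper for illustration only. Existing directories
    are left alone. Returns the list of directory paths.
    """
    root = Path(instance_root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # parents=True also creates instance_root, data, and partition0.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_instance_dirs with the path you have chosen for instance_root is safe to repeat, because directories that already exist are not modified.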

After creating your directory structure, you should:

Copy your source data extracts to instance_root/data/incoming.
Copy any control scripts you want to use or modify to the etc directory. You can find reference control scripts in the etc directory of the sample_wine_data reference implementation.
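
The two copy steps above can be sketched in Python as follows. The function names copy_files and populate_instance, and the extracts_dir and scripts_dir parameters, are hypothetical; they stand for wherever your source data extracts and reference control scripts currently live. The sketch copies only the regular files at the top level of each source directory.

```python
import shutil
from pathlib import Path


def copy_files(src_dir, dest_dir):
    """Copy every regular file in src_dir into dest_dir.

    Hypothetical helper for illustration only. Subdirectories of
    src_dir are skipped. Returns the sorted names of copied files.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():
            # copy2 preserves timestamps and permission bits, which
            # matters for control scripts that must stay executable.
            shutil.copy2(item, dest / item.name)
            copied.append(item.name)
    return copied


def populate_instance(instance_root, extracts_dir, scripts_dir):
    """Copy extracts to data/incoming and control scripts to etc."""
    root = Path(instance_root)
    return {
        "incoming": copy_files(extracts_dir, root / "data" / "incoming"),
        "etc": copy_files(scripts_dir, root / "etc"),
    }
```

For the scripts_dir argument, you might pass the etc directory of the sample_wine_data reference implementation and then edit the copied scripts in place.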