49 Site Capture File System

The Site Capture file system is created during the Site Capture installation process to store installation-related files, property files, sample crawlers, and sample code used by the FirstSiteII crawler to control its site capture process. The file system also provides the framework in which Site Capture organizes custom crawlers and their captures.

This chapter contains the following topics:

  • Section 49.1, "General Directory Structure"

  • Section 49.2, "Custom Folders"

49.1 General Directory Structure

Figure 49-1 shows the most frequently accessed folders in the Site Capture file system, to help administrators locate commonly used information. All folders, except for <crawlerName>, are created during the Site Capture installation process. For information about <crawlerName> folders, see Table 49-1, "Site Capture's Frequently Accessed Folders," and Section 49.2, "Custom Folders."

Figure 49-1 Site Capture File System


Table 49-1 Site Capture's Frequently Accessed Folders


/fw-site-capture

The parent folder.

/fw-site-capture/crawler

Contains all Site Capture crawlers, each stored in its own crawler-specific folder.

/fw-site-capture/crawler/_sample

Contains the source code for the FirstSiteII sample crawler.

Note: Folder names beginning with the underscore character ("_") are not treated as crawlers. They are not displayed in the Site Capture interface.

/fw-site-capture/crawler/Sample

Represents a crawler named "Sample." This folder is created only if the "Sample" crawler was installed during the Site Capture installation process.

The Sample folder contains an /app folder, which stores the CrawlerConfiguration.groovy file specific to the "Sample" crawler. The file contains basic configuration code for capturing any dynamic site. The code demonstrates the use of required methods (such as getStartUri) in the BaseConfigurator class.

When the Sample crawler is invoked in static or archive mode, subfolders are created within the /Sample folder.
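
The CrawlerConfiguration.groovy file described above is written in Groovy. The following is a minimal, illustrative sketch of such a file, not the shipped Sample configuration: only the BaseConfigurator class and the getStartUri method are named in this chapter, while the class name, start URI, import, and exact method signature shown here are assumptions.

    // CrawlerConfiguration.groovy -- illustrative sketch only, not the shipped
    // Sample configuration. The import of BaseConfigurator from the Site
    // Capture crawler API is assumed to be available on the crawler's classpath.
    class SampleConfigurator extends BaseConfigurator {

        // Required method: returns the URI(s) at which the crawl begins.
        // The String[] return type and the sample URI are assumptions.
        String[] getStartUri() {
            return ["http://www.example.com/home"] as String[]
        }
    }

For the authoritative set of required methods and their signatures, refer to the CrawlerConfiguration.groovy file installed in the Sample crawler's /app folder.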

/fw-site-capture/logs

Contains the crawler.log file, a system log for Site Capture.

/fw-site-capture/publish-listener

Contains the following files, which are needed to set up Site Capture for publishing-triggered crawls:

  • fw-crawler-publish-listener-1.1-elements.zip

  • fw-crawler-publish-listener-1.1.jar

/fw-site-capture/Sql-Scripts

Contains the following scripts, which create database tables that are needed by Site Capture to store its data:

  • crawler_db2_db.sql

  • crawler_oracle_db.sql

  • crawler_sql_server_db.sql

/fw-site-capture/webapps

Contains the ROOT/WEB-INF/ folder.

/fw-site-capture/webapps/ROOT/WEB-INF

Contains the log4j.xml file, used to customize the path to the crawler.log file.
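
For illustration, a fragment of the following form, using standard log4j XML configuration, could direct a file appender to a custom crawler.log location. The appender name, rolling policy, pattern, and path shown are assumptions; the chapter states only that log4j.xml is where the crawler.log path is customized.

    <!-- Illustrative fragment only; the appender name, path, and pattern are assumptions. -->
    <appender name="file" class="org.apache.log4j.RollingFileAppender">
      <!-- Point the log output at the desired crawler.log location. -->
      <param name="File" value="/u01/fw-site-capture/logs/crawler.log"/>
      <param name="MaxFileSize" value="10MB"/>
      <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
      </layout>
    </appender>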

/fw-site-capture/webapps/ROOT/WEB-INF/classes

Contains the following files:

  • sitecapture.properties file, where you specify information about the WebCenter Sites application with which Site Capture communicates. The information includes the WebCenter Sites host machine name (or IP address) and port number.

  • root-context.xml file, where you can configure the Site Capture database.
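
As an illustration only, the sitecapture.properties file might take the shape sketched below. The property keys are hypothetical placeholders; the chapter names the file and the kind of information it holds, not the actual keys, so consult the installed file for the real property names.

    # sitecapture.properties -- illustrative sketch only.
    # The keys below are hypothetical placeholders, not the keys Site Capture
    # actually defines; check the installed file for the real property names.

    # Host machine name (or IP address) of the WebCenter Sites application
    example.sites.host=sites.example.com

    # Port number of the WebCenter Sites application
    example.sites.port=8080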


49.2 Custom Folders

A custom folder is created for every crawler that a user creates in the Site Capture interface. The custom folder, <crawlerName>, is used to organize the crawler's configuration file, captures, and logs, as summarized in Figure 49-2.

Figure 49-2 Site Capture's Custom Folders: <crawlerName>
