This chapter presents an overview of the Site Capture application and the installation process.
Oracle WebCenter Sites: Site Capture is a web application that integrates with Oracle WebCenter Sites through the Oracle WebCenter Sites: Web Experience Management (WEM) Framework. It captures dynamically published websites for purposes such as evaluation, compliance, and high availability.
Crawls can be initiated manually from the Site Capture interface, or they can be triggered by the completion of a WebCenter Sites RealTime publishing session. In each scenario, the crawler captures the site in one of the following modes, depending on the user's selections:
Static mode: The site is stored as files that are ready to be served. Only the latest capture is kept.
Archive mode: The site is stored in a zip file. A pointer in the Site Capture database enables archive management from the Site Capture interface.
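The difference between the two modes can be illustrated with a minimal sketch. This is not Site Capture's implementation: the `store_capture` function, the `static` and `archive` directory names, and the timestamp-based zip naming are all assumptions made for illustration, and the returned zip path merely stands in for the pointer that Site Capture records in its database.

```python
import shutil
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def store_capture(pages, root, mode, stamp=None):
    """Store crawled pages (relative path -> bytes) under root.

    Hypothetical helper, for illustration only.
    mode="static":  write files ready to be served, replacing any earlier capture.
    mode="archive": write one zip file per capture and keep earlier archives.
    """
    root = Path(root)
    if mode == "static":
        site_dir = root / "static"
        if site_dir.exists():
            shutil.rmtree(site_dir)  # only the latest capture is kept
        site_dir.mkdir(parents=True)
        for rel_path, body in pages.items():
            target = site_dir / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(body)
        return site_dir
    if mode == "archive":
        # Assumed naming scheme: one zip per capture, named by UTC timestamp.
        stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        archive_dir = root / "archive"
        archive_dir.mkdir(parents=True, exist_ok=True)
        zip_path = archive_dir / f"{stamp}.zip"
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for rel_path, body in pages.items():
                zf.writestr(rel_path, body)
        return zip_path  # stands in for the database pointer
    raise ValueError(f"unknown capture mode: {mode!r}")
```

In this sketch, repeated static captures overwrite one another, whereas each archive capture adds a new zip file alongside the earlier ones.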
This guide contains procedures for installing and configuring Site Capture to support:
Static and archive capture initiated manually, from the Site Capture interface.
Static and archive capture triggered by the completion of a WebCenter Sites RealTime publishing session. Setting up publishing-triggered site capture is optional.
Administrative users and developers. The Site Capture application is designed for general administrators of the WebCenter Sites system on which the Site Capture application will be running. Developers will write advanced crawler configuration code – for example, code that triggers Site Capture to execute a post-crawl command such as copying statically captured sites to a web server's doc base.
Users of this guide must have experience installing and configuring enterprise-level software, such as application servers and databases. Also required is a general administrator's experience with WebCenter Sites and the WEM Admin interface.
To complete the procedures in this guide, you must be a WebCenter Sites general administrator who belongs to the RESTAdmin security group.
Download the Oracle WebCenter Sites 11g Release 1 (11.1.1.x) Certification Matrix for information about supported operating systems, application servers, databases, and browsers. Also, download the Oracle WebCenter Sites release notes for information about Site Capture.
Read this guide to acquire an understanding of the Site Capture installation process. The basic steps are configuring the Site Capture application server, running the installer to create the Site Capture war file, deploying Site Capture, completing post-installation and verification steps, and, if necessary, setting up publishing-triggered site capture.
On all systems, set the JAVA_HOME variable to point to a valid installation of JDK 1.6.
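Before running the installer, the JAVA_HOME setting can be sanity-checked with a short script. The following is a minimal sketch, not part of the Site Capture distribution: the `check_java_home` function is hypothetical, and it only confirms that the variable points to a directory containing the `java` and `javac` executables. It does not verify the JDK version.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return a list of problems with JAVA_HOME; an empty list means it looks usable.

    Hypothetical helper, for illustration only. It does not check the JDK version.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return ["JAVA_HOME is not set"]
    home = Path(java_home)
    if not home.is_dir():
        return [f"JAVA_HOME does not point to a directory: {home}"]
    problems = []
    # A JDK (unlike a JRE) ships the javac compiler next to the java launcher.
    for tool in ("java", "javac"):
        candidates = (tool, tool + ".exe")  # .exe covers Windows installations
        if not any((home / "bin" / name).is_file() for name in candidates):
            problems.append(f"{tool} not found under {home / 'bin'}")
    return problems
```

Calling `check_java_home()` with no arguments inspects the current process environment. An empty list indicates that no problems were found by these checks.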
Prepare the Site Capture installation components:
Ensure you have a dedicated and fully functional Oracle WebCenter Sites installation running in development or content management mode. Site Capture must communicate with this WebCenter Sites system in order to run.
Ensure you have a dedicated host machine on which to install the Site Capture application server and the application itself. During the installation process, you will configure Site Capture to communicate with the WebCenter Sites system (described above), which runs on its own host machine.
Decide whether your Site Capture application will run as a single instance or in clustered mode. For a diagram of single-server and clustered installations, see Figure 21-1.
Install or reuse the following components:
Install a supported Site Capture application server on the dedicated host machine. For clustered installations, install an application server for each Site Capture instance.
For a clustered installation, install a load balancer on a host machine of your choice. The Site Capture installation directory must be a shared directory, accessible to all cluster members.
To store archived sites, you can either reuse WebCenter Sites' database or install a dedicated, supported Site Capture database on a host machine of your choice.
Decide whether you will be running publishing-triggered site capture. If so, you will need a WebCenter Sites source and target system:
The target system provides the REST API and WEM SSO API, which enable the target system to communicate with the Site Capture application at the end of the publishing session in order to start the required crawlers. The Site Capture application will then send crawler invocation status to the target WebCenter Sites system, which will, in turn, send the same information to the source WebCenter Sites system. Both WebCenter Sites systems record the status information in their own log files (futuretense.txt, by default).
You will integrate the source and target WebCenter Sites systems with your Site Capture installation. For some of the possible configurations, see Figure 21-2, "Single-Server Installation Enabled for Publishing-Triggered Site Capture" and Figure 21-3, "Clustered Installation Enabled for Publishing-Triggered Site Capture".
You have the option to install Site Capture silently or graphically. The silent installer provides help and sample values for every piece of information that needs to be set.